Bell's Inequalities: Foundations and Quantum Communication

Abstract For individual events quantum mechanics makes only probabilistic
predictions. Can one go beyond quantum mechanics in this respect? This question
has been a subject of debate and research since the early days of the theory.
Efforts to construct a deeper, realistic level of physical description, in which
individual systems have, as in classical physics, preexisting properties revealed
by measurements, are known as hidden-variable programs. Demonstrations that a
hidden-variable program necessarily requires outcomes of certain experiments to
disagree with the predictions of quantum theory are called "no-go theorems". The
Bell theorem excludes local hidden-variable theories. The Kochen-Specker theorem
excludes noncontextual hidden-variable theories. In local hidden-variable
theories faster-than-light influences are forbidden, thus the results for a given
measurement (actual, or just potentially possible) are independent of the
settings of other measurement devices which are at space-like separation. In
noncontextual hidden-variable theories the predetermined results of a
(degenerate) observable are independent of any other observables that are
measured jointly with it.

It is a fundamental doctrine of quantum information science that quantum
communication and quantum computation outperform their classical counterparts.
If this is to be true, some fundamental quantum characteristics must be behind
the better-than-classical performance of information processing tasks. This
chapter aims at establishing connections between certain quantum information
protocols and foundational issues in quantum theory. After a brief discussion of
the most common misinterpretations of Bell's theorem and a discussion of what
its real meaning is, it will be demonstrated how quantum contextuality and
violations of local realism can be used as resources in quantum information
applications. In any case, the readers should bear in mind that this chapter is
not a review of the literature of the subject, but rather a quick introduction.

Caslav Brukner
Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna;
Institute of Quantum Optics and Quantum Information, Austrian Academy of
Sciences, Boltzmanngasse 3, 1090 Vienna
e-mail: caslav.brukner@univie.ac.at

Marek Zukowski
Institute for Theoretical Physics and Astrophysics, University of Gdansk,
80-952 Gdansk, Poland
e-mail: marek.zukowski@univie.ac.at



1 Introduction 

Which quantum states are useful for quantum information processing? All
non-separable states? Only distillable non-separable states? Only those which
violate constraints imposed by local realism? Entanglement is the most distinct
feature of quantum physics with respect to the classical world [1]. On the one
hand, entangled states violate Bell inequalities, and thus rule out a local
realistic explanation of quantum mechanics. On the other hand, they enable
certain communication and computation tasks to have an efficiency not achievable
by the laws of classical physics. Intuition suggests that these two aspects, the
fundamental one and the one associated with applications, are intimately linked.
It is natural to assume that the quantum states which allow the no-go theorems
of quantum theory, such as the Kochen-Specker, Bell, or
Greenberger-Horne-Zeilinger theorems, should also be useful for quantum
information processing. If this were not true, one might expect that the
efficiency of quantum information protocols could be simulated by classical,
essentially local realistic or noncontextual, models, and would thus be
achievable already via classical means. This intuitive reasoning is supported by
the results of, for example, Acin et al. [2]: violation of a Bell inequality is
a criterion for the security of quantum key distribution protocols. Also, it was
shown that violation of Bell's inequalities by a quantum state implies that
pure-state entanglement can be distilled from it, and that Bell's inequalities
are related to optimal solutions of quantum state targeting [4]. In this
overview we will give other examples that demonstrate the strong link between
fundamental features of quantum states and their applicability in quantum
information protocols, such as quantum communication complexity problems,
quantum random access, or certain quantum games.



2 Quantum predictions for two-qubit systems

To set the stage for our story let us first describe two-qubit systems in full detail.

We shall present predictions for all possible local yes-no experiments on two
spin-1/2 systems (in modern terminology, qubits) for all possible quantum
states, i.e. from the pure maximally entangled singlet state (or the Bohm-EPR
state), via factorizable (i.e. non-entangled) states, up to any mixed state.
This will enable us to reveal the distinguishing traits of the quantum
predictions for entangled states of the simplest possible compound quantum
system. The formalism can be applied to any system consisting of two subsystems,
such that each of them is described by a two-dimensional Hilbert space. We
choose the spin-1/2 convention to simplify the description.



2.1 Pure states 

An important tool simplifying the analysis of the pure states of two subsystems is 
the so-called Schmidt decomposition. 

2.1.1 Schmidt decomposition 

For any nonfactorizable (i.e., entangled) pure state of a pair of quantum
subsystems, one described by a Hilbert space of dimension N, the other by a
space of dimension M, $N \le M$, it is always possible to find preferred bases,
one basis for the first system, another one for the second, such that the state
becomes a sum of bi-orthogonal terms, i.e.

$$|\psi\rangle = \sum_{i=1}^{N} c_i\, |a_i\rangle_1 |b_i\rangle_2 \tag{1}$$

with ${}_n\langle x_i | x_j \rangle_n = \delta_{ij}$, for $x = a, b$ and
$n = 1, 2$. It is important to stress that the appropriate single-subsystem
bases, here $|a_i\rangle_1$ and $|b_j\rangle_2$, depend upon the state that we
want to Schmidt-decompose.

The ability to Schmidt-decompose the state is equivalent to a well-known fact
from matrix algebra, that any $N \times M$ matrix $A$ can always be put into a
diagonal form $D$ by applying a pair of unitary transformations:
$\sum_{j=1}^{N} \sum_{k=1}^{M} U_{ij} A_{jk} U'_{kl} = D_i \delta_{il}$.

The interpretation of the above formula can be put as follows. If the quantum
pure state of two systems is non-factorizable, then there exists a pair of local
observables (for system 1 with eigenstates $|a_i\rangle_1$, and for system 2
with eigenstates $|b_j\rangle_2$) such that the results of their measurement are
perfectly correlated.

The method of Schmidt decomposition allows one to put every pure normalized
state of two spins into the form

$$|\psi\rangle = \cos(\alpha/2)\, |+\rangle_1 |+\rangle_2 + \sin(\alpha/2)\, |-\rangle_1 |-\rangle_2 . \tag{2}$$

Schmidt decomposition generally allows the coefficients to be real. This is achiev- 
able via trivial phase transformations of the preferred bases. 



2.2 Arbitrary states 

Systems can be in mixed states. Such states describe situations in which there
does not exist any nondegenerate observable for which the measurement result is
deterministic. This is the case when the system can be, with various
probabilities $P(x) > 0$, in some non-equivalent states $|\psi(x)\rangle$, with
$\sum_x P(x) = 1$. Mixed states are represented by self-adjoint, non-negative
density operators $\rho = \sum_x P(x)\, |\psi(x)\rangle\langle\psi(x)|$. As
$\mathrm{Tr}\, |\psi(x)\rangle\langle\psi(x)| = 1$, one has
$\mathrm{Tr}\,\rho = 1$.

Let us present in detail the properties of mixed states of two spin-1/2 systems.
Any self-adjoint operator for one spin-1/2 particle is a linear combination of
the Pauli matrices $\sigma_i$, $i = 1, 2, 3$, and the identity operator,
$\sigma_0 = 1$, with real coefficients. Thus, any self-adjoint operator in the
tensor product of the two spin-1/2 Hilbert spaces must be a real linear
combination of all possible products of the operators
$\sigma^{(1)}_\mu \sigma^{(2)}_\nu$, where the Greek indices run from 0 to 3,
and the superscripts denote the particle. As the trace of $\sigma_i$ is zero, we
arrive at the following form of the general density operator for two spin-1/2
systems:

$$\rho = \frac{1}{4}\left( 1^{(1)} \otimes 1^{(2)} + \mathbf{r}\cdot\boldsymbol{\sigma}^{(1)} \otimes 1^{(2)} + 1^{(1)} \otimes \mathbf{s}\cdot\boldsymbol{\sigma}^{(2)} + \sum_{m,n=1}^{3} T_{nm}\, \sigma^{(1)}_n \otimes \sigma^{(2)}_m \right), \tag{3}$$

where $\mathbf{r}$, $\mathbf{s}$ are real three-dimensional vectors and
$\mathbf{r}\cdot\boldsymbol{\sigma} = \sum_{i=1}^{3} r_i \sigma_i$. We shall use
the tensor product symbol $\otimes$ sparingly, only whenever it is deemed
necessary. The condition $\mathrm{Tr}\,\rho = 1$ is satisfied thanks to the
first term.

Since the average of any real variable which can take only the two values +1 and
-1 can be neither larger than 1 nor smaller than -1, the real coefficients
$T_{nm}$ satisfy the relations

$$-1 \le T_{nm} = \mathrm{Tr}\,\rho\, \sigma^{(1)}_n \otimes \sigma^{(2)}_m \le 1, \tag{4}$$

and they form a matrix which will be denoted by $\hat{T}$. One also has

$$-1 \le r_n = \mathrm{Tr}\,\rho\, \sigma^{(1)}_n \le 1, \tag{5}$$

and

$$-1 \le s_m = \mathrm{Tr}\,\rho\, \sigma^{(2)}_m \le 1. \tag{6}$$



2.2.1 Reduced density matrices for subsystems 

A reduced density matrix represents the local state of one subsystem of a
compound system. If we have two subsystems, then the average of any observable
which pertains to the first system only, i.e. of the form $A \otimes 1$, where 1
is the identity operator for system 2, can be expressed as
$\mathrm{Tr}_{12}[(A \otimes 1)\rho] = \mathrm{Tr}_1[A(\mathrm{Tr}_2\,\rho)]$.
Here $\mathrm{Tr}_i$ represents a trace with respect to system $i$. As the trace
is a basis-independent notion, one can always choose a factorizable basis, and
therefore split the trace calculation into two stages. The reduced one-particle
matrices for spins 1/2 are of the following form:

$$\rho_1 = \mathrm{Tr}_2\,\rho = \frac{1}{2}\left(1 + \mathbf{r}\cdot\boldsymbol{\sigma}^{(1)}\right), \tag{7}$$
$$\rho_2 = \mathrm{Tr}_1\,\rho = \frac{1}{2}\left(1 + \mathbf{s}\cdot\boldsymbol{\sigma}^{(2)}\right), \tag{8}$$

with $\mathbf{r}$ and $\mathbf{s}$ the two local Bloch vectors of the spins.

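Let us denote the eigenvectors of the spin projection along direction
$\mathbf{a}$ of the first spin as $|\psi(\pm 1, \mathbf{a})\rangle_1$. They are
defined by the relation

$$\mathbf{a}\cdot\boldsymbol{\sigma}^{(1)}\, |\psi(\pm 1, \mathbf{a})\rangle_1 = \pm 1\, |\psi(\pm 1, \mathbf{a})\rangle_1 , \tag{9}$$

where $\mathbf{a}$ is a real vector of unit length (i.e.
$\mathbf{a}\cdot\boldsymbol{\sigma}^{(1)}$ is a Pauli operator in the direction
of $\mathbf{a}$). The probability of a measurement of this Pauli observable to
give a result $\pm 1$ is given by

$$P(\pm 1 | \mathbf{a})_1 = \mathrm{Tr}_1\, \rho_1\, \pi^{(1)}_{(\mathbf{a}, \pm 1)} = \frac{1}{2}\left(1 \pm \mathbf{a}\cdot\mathbf{r}\right), \tag{10}$$

and it is non-negative for arbitrary $\mathbf{a}$ if, and only if, the norm of
$\mathbf{r}$ satisfies

$$|\mathbf{r}| \le 1. \tag{11}$$

Here $\pi^{(1)}_{(\mathbf{a}, \pm 1)}$ is the projector
$|\psi(\pm 1, \mathbf{a})\rangle_{1\,1}\langle\psi(\pm 1, \mathbf{a})|$.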



2.3 Local measurements on two spins 

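The probabilities for local measurements to give the result $l = \pm 1$ for
particle 1 and the result $m = \pm 1$ for particle 2, under specified local
settings, $\mathbf{a}$ and $\mathbf{b}$ respectively, of the quantization axes
are given by:

$$P(l, m | \mathbf{a}, \mathbf{b})_{1,2} = \mathrm{Tr}\,\rho\, \pi^{(1)}_{(\mathbf{a}, l)}\, \pi^{(2)}_{(\mathbf{b}, m)} = \frac{1}{4}\left(1 + l\,\mathbf{a}\cdot\mathbf{r} + m\,\mathbf{b}\cdot\mathbf{s} + l m\,\mathbf{a}\cdot\hat{T}\mathbf{b}\right), \tag{12}$$

where $\hat{T}\mathbf{b}$ denotes the transformation of the column vector
$\mathbf{b}$ by the matrix $\hat{T}$ (we treat here Euclidean vectors as column
matrices).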
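Formula (12) can be checked numerically. The following is a minimal sketch in plain Python (no external libraries), assuming the standard Pauli matrices with |+>, |-> as the eigenstates of sigma_z; all helper names (`kron`, `expectation`, `bloch_parameters`, `prob_direct`, `prob_formula`) are hypothetical and introduced only for this illustration. It extracts r, s and the correlation matrix for the singlet state, and compares the right-hand side of (12) with a direct evaluation of the trace.

```python
import math

# Standard Pauli matrices (assumed convention: |+>, |-> are sigma_z eigenstates).
ID = [[1, 0], [0, 1]]
PAULI = [
    [[0, 1], [1, 0]],       # sigma_x
    [[0, -1j], [1j, 0]],    # sigma_y
    [[1, 0], [0, -1]],      # sigma_z
]

def kron(a, b):
    """Tensor product of two 2x2 matrices, returned as a 4x4 nested list."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def expectation(rho, op):
    """Real part of Tr(rho op) for 4x4 matrices."""
    return sum(rho[i][k] * op[k][i] for i in range(4) for k in range(4)).real

def density(psi):
    """Projector |psi><psi| for a four-component state vector."""
    return [[complex(x) * complex(y).conjugate() for y in psi] for x in psi]

def bloch_parameters(rho):
    """Local Bloch vectors r, s and the correlation matrix T of the form (3)."""
    r = [expectation(rho, kron(p, ID)) for p in PAULI]
    s = [expectation(rho, kron(ID, p)) for p in PAULI]
    t = [[expectation(rho, kron(pn, pm)) for pm in PAULI] for pn in PAULI]
    return r, s, t

def projector(n, outcome):
    """Single-spin projector (1 + outcome * n.sigma) / 2 for a unit vector n."""
    return [[(ID[i][j] + outcome * sum(n[k] * PAULI[k][i][j] for k in range(3))) / 2
             for j in range(2)] for i in range(2)]

def prob_direct(rho, a, b, l, m):
    """P(l, m | a, b) evaluated directly as Tr(rho pi_a pi_b)."""
    return expectation(rho, kron(projector(a, l), projector(b, m)))

def prob_formula(r, s, t, a, b, l, m):
    """Right-hand side of (12)."""
    dot = lambda u, v: sum(x * y for x, y in zip(u, v))
    tb = [dot(row, b) for row in t]
    return 0.25 * (1 + l * dot(a, r) + m * dot(b, s) + l * m * dot(a, tb))

# Singlet state in the basis |++>, |+->, |-+>, |-->.
singlet = [0, 1 / math.sqrt(2), -1 / math.sqrt(2), 0]
rho = density(singlet)
r, s, t = bloch_parameters(rho)

a = (0.0, 0.0, 1.0)                        # setting for spin 1 (illustrative)
b = (math.sin(0.7), 0.0, math.cos(0.7))    # setting for spin 2 (illustrative)
for l in (+1, -1):
    for m in (+1, -1):
        print(l, m, prob_direct(rho, a, b, l, m), prob_formula(r, s, t, a, b, l, m))
```

For the singlet the two local Bloch vectors vanish and the correlation matrix is minus the identity, so both columns of the printout reduce to (1 - l m a.b)/4.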

One can simplify all these relations by performing suitable local unitary
transformations upon each of the subsystems, i.e. via factorizable unitary
operators $U^{(1)} U^{(2)}$. It is well known that any unitary operation upon a
spin 1/2 is equivalent to a three-dimensional rotation in the space of Bloch
vectors. In other words, for any real vector $\mathbf{w}$

$$U(\hat{O})\, \mathbf{w}\cdot\boldsymbol{\sigma}\, U(\hat{O})^{\dagger} = (\hat{O}\mathbf{w})\cdot\boldsymbol{\sigma}, \tag{13}$$

where $\hat{O}$ is the orthogonal matrix of the rotation. If the density matrix
is subjected to such a transformation on each spin subsystem, i.e. to the
$U^{(1)}(\hat{O}_1)\, U^{(2)}(\hat{O}_2)$ transformation, the parameters
$\mathbf{r}$, $\mathbf{s}$ and $\hat{T}$ transform as follows:

$$\mathbf{r}' = \hat{O}_1 \mathbf{r}, \qquad \mathbf{s}' = \hat{O}_2 \mathbf{s}, \qquad \hat{T}' = \hat{O}_1 \hat{T} \hat{O}_2^{T}. \tag{14}$$

Thus, for an arbitrary state, we can always choose a factorizable unitary
transformation such that the corresponding rotations (i.e. orthogonal
transformations) diagonalize the correlation tensor (matrix) $\hat{T}$. This can
be seen as another application of Schmidt's decomposition, this time in the case
of second-rank tensors.

The physical interpretation of the above is that one can always choose two
(local) systems of coordinates, one for the first particle, the other for the
second particle, in such a way that the matrix $\hat{T}$ is diagonal.

Let us note that one can decompose the two-spin density matrix into:

$$\rho = \rho_1 \otimes \rho_2 + \frac{1}{4} \sum_{m,n=1}^{3} C_{nm}\, \sigma^{(1)}_n \otimes \sigma^{(2)}_m , \tag{15}$$

i.e., it is a sum of the product of the two reduced density matrices and a term
with $\hat{C} = \hat{T} - \mathbf{r}\mathbf{s}^{T}$, which is responsible for
correlation effects.

Any density operator satisfies the inequality
$\frac{1}{d} \le \mathrm{Tr}\,\rho^2 \le 1$, where $d$ is the dimension of the
Hilbert space in which it acts, i.e. of the system it describes. The value of
$\mathrm{Tr}\,\rho^2$ is a measure of the purity of the quantum state. It is
equal to 1 only for one-dimensional projectors, i.e. the pure states. In the
studied case one must have

$$|\mathbf{r}|^2 + |\mathbf{s}|^2 + \|\hat{T}\|^2 \le 3. \tag{16}$$
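The bound (16) follows in one step from the form (3) of the density operator. The short derivation below is a sketch under the assumption, not stated explicitly in the text, that the squared norm of the correlation matrix denotes the sum of the squares of its entries $T_{nm}$.

```latex
% Uses Tr[(\sigma_\mu \otimes \sigma_\nu)(\sigma_{\mu'} \otimes \sigma_{\nu'})]
%      = 4 \delta_{\mu\mu'} \delta_{\nu\nu'}, with \sigma_0 = 1.
\begin{align*}
\mathrm{Tr}\,\rho^2
  &= \frac{1}{16}\cdot 4\left(1 + \sum_{n=1}^{3} r_n^2 + \sum_{m=1}^{3} s_m^2
     + \sum_{n,m=1}^{3} T_{nm}^2\right)
   = \frac{1}{4}\left(1 + |\mathbf{r}|^2 + |\mathbf{s}|^2 + \|\hat{T}\|^2\right), \\
\mathrm{Tr}\,\rho^2 \le 1
  &\;\Longrightarrow\; |\mathbf{r}|^2 + |\mathbf{s}|^2 + \|\hat{T}\|^2 \le 3 .
\end{align*}
```

Equality holds exactly for pure states. For the singlet state, for example, both local Bloch vectors vanish and the three diagonal correlation entries are all equal to -1, so the left-hand side is 3.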

For pure states, represented by the Schmidt decomposition (2), with
$|\pm\rangle$ taken as the eigenstates of $\sigma_z$ and the standard form of the
Pauli matrices, $\hat{T}$ is diagonal with entries $T_{xx} = \sin\alpha$,
$T_{yy} = -\sin\alpha$ and $T_{zz} = 1$, whereas $\mathbf{r} = \mathbf{s}$, and
only their z component can be non-zero: $r_z = s_z = \cos\alpha$. Thus, in the
case of a maximally entangled state, $\hat{T}$ has only diagonal entries, equal
to +1 and -1. In the case of the singlet state,

$$|\psi\rangle = \frac{1}{\sqrt{2}}\left( |+\rangle_1 |-\rangle_2 - |-\rangle_1 |+\rangle_2 \right), \tag{17}$$

which can be obtained from Eq. (2) by putting $\alpha = -\pi/2$ and rotating one
of the subsystems such that $|+\rangle$ and $|-\rangle$ interchange (this is
equivalent to a 180 degree rotation about the x axis, see the transformation
rules above), the diagonal elements of the correlation tensor are all -1.



3 Einstein-Podolsky-Rosen Experiment 

In their seminal 1935 paper [5] entitled "Can quantum-mechanical description of
physical reality be considered complete?" Einstein, Podolsky and Rosen (EPR)
consider quantum systems consisting of two particles such that, while neither
position nor momentum of either particle is well defined, the difference of
their positions and the sum of their momenta are both precisely defined. It then
follows that a measurement of either position or momentum performed on, say,
particle 1 immediately implies for particle 2 a precise position or momentum,
respectively, even when the two particles are separated by arbitrary distances
without any actual interaction between them.

We shall present the EPR argumentation for the incompleteness of quantum
mechanics in the language of spin-1/2 systems. This has been done by Bohm in
1952. A two-qubit example of an EPR state is the singlet state (17). Properties
of a singlet can be inferred without the mathematical considerations given
above. This is a state of zero total spin. Thus measurements of the same
component of the two spins must always give opposite values: this is simply the
conservation of angular momentum at work. In the language of Pauli matrices, the
product of the local results is then always -1. We have (infinitely many)
perfect (anti-)correlations. We assume that the two spins are very far apart,
but nevertheless in the singlet state.

After the translation into Bohm's example, the EPR argument runs as follows.
Here are its premises:

1. Perfect correlations: If the same spin component, whichever it is, is
measured on particles 1 and 2, then with certainty the outcomes will be found to
be perfectly anti-correlated.

2. Locality: "Since at the time of measurements the two systems no longer interact, 
no real change can take place in the second system in consequence of anything 
that may be done to the first system." 

3. Reality: "If, without in any way disturbing a system, we can predict with certainty 
(i.e., with probability equal to unity) the value of a physical quantity, then there 
exists an element of physical reality corresponding to this physical quantity." 

4. Completeness: "Every element of the physical reality must have a counterpart in 
the [complete] physical theory." 

In contrast to the last three premises which, though they are quite plausible,
are still indications of a certain philosophical viewpoint, the first premise is
a statement about a well-established property of the singlet state.

The EPR argument is as follows. Because of the perfect anti-correlations (1.), we 
can predict with certainty the result of measuring either x component or y component 
of spin of particle 2 by previously choosing to measure the same quantity of particle 
1. By locality (2.), the measurement of particle 1 cannot cause any real change in 
particle 2. This implies that by the premise (3.), both the x and the y components 
of spin of particle 2 are elements of reality. This is also the case for particle 1 by 
a parallel argument where particle 1 and 2 interchange their roles. Yet, (according 
to Heisenberg's uncertainty principle) there is no quantum state of a single spin in 
which both x and y spin components have definite values. Therefore, by premise (4.) 
quantum mechanics cannot be a complete theory. 

In his answer [6], published in the same year and under the same title as the
EPR paper, Bohr criticized the EPR concept of "reality" as assuming that the
systems have intrinsic properties independently of whether they are observed or
not, and he argued for "the necessity of a final renunciation of the classical
ideal of causality and a radical revision of our attitude towards the problem of
physical reality." Bohr pointed out that the wording of the criterion of
physical reality (3.) proposed by EPR contains an ambiguity with respect to the
expression "without in any way disturbing the system". And while, as Bohr wrote,
there is "no question of mechanical disturbance of the system", there is "the
question of an influence on the very conditions which define the possible types
of predictions regarding the future behavior of the system." Bohr thus pointed
out that the results of quantum measurements, in contrast to those of classical
measurements, depend on the complete experimental arrangement (context), which
can even be non-local, as in the EPR case. Before any measurement is performed,
only the correlations between the spin components of the two particles, but not
the spin components of the individual particles, are defined. The x or y
component (but never both) of an individual particle becomes defined only when
the respective observable of the distant particle is measured.

Perhaps the clearest way to see how strongly the philosophical viewpoints
of EPR and Bohr differ is in their visions of the future development of quantum 
physics. While EPR wrote: "We believe that such [complete] a theory is possible", 
Bohr's opinion is that (his) complementarity "provides room for new physical law, 
the coexistence of which might at first sight appear irreconcilable with the basic 
principles of science." 



4 Bell's theorem 

Bell's theorem can be thought of as a disproof of the validity of the EPR ideas.
Elements of physical reality cannot be an internally consistent notion. A
broader interpretation of this result is that a local and realistic description
of nature, at the fundamental level, is untenable. A further consequence is that
there exist quantum processes which cannot be modelled by any classical ones:
not only by classical physical processes, but also by classical computer
simulations with a communication constraint. This opened the possibility of the
development of quantum communication.

We shall present now a derivation of Bell's inequalities. The stress will be put 
on clarification of the underlying assumptions. These will be presented in the most 
reduced form. 



4.1 Thought experiment 

At two measuring stations A and B, which are far away from each other, two
characters, Alice and Bob, observe simultaneous flashes of the numbers +1 or -1
on the displays of their local devices (or the monitoring computers). The
flashes appear in perfect coincidence (with respect to a certain reference
frame). In the middle between the stations is something that they call the
"source". When it is absent, or switched off, the numbers ±1 do not appear on
the displays. The activated source always causes two flashes, one at A, one at
B. They appear slightly after a relativistic retardation time with respect to
the activation of the source, never before. Thus there is enough "evidence" for
Alice and Bob that the source causes the flashes. The devices at the stations
have a knob which can be put in two positions: n = 1 or 2 at station A, and
m = 1 or 2 at station B. Local procedures used to generate random choices of
local knob positions are equivalent to independent, fair coin tosses. Thus, each
of the four possible values of the pair n, m is equally likely, i.e. the
probability $P(n,m) = P(n)P(m) = \frac{1}{4}$. The "tosses", and knob settings,
are made at random times, and often enough, so that the information on these is
never available at the source during its activation periods (the tosses and
settings cannot have a causal influence on the workings of the source). The
local measurement data (setting, result, moment of measurement) are stored, and
very many runs of the experiment are performed.

Fig. 1 Test of Bell's inequalities. Alice and Bob are two separated parties who
share entangled particles. Each of them is free to choose between two
measurement settings, 1 and 2, and they observe flashes in their detection
stations which indicate one of the two possible measurement outcomes, +1 or -1.



4.1.1 Assumptions leading to Bell's inequalities 

A concise local realistic description of such an experiment would use the
following assumptions [7]:

1. We assume realism, which is any logically self-consistent model that allows
one to use eight variables in the theoretical description of the experiment:
$A_{n,m}$, $B_{n,m}$, where n, m = 1, 2. The variable $A_{n,m}$ gives the value,
±1, which could be obtained at station A, if the knob settings, at A and B, were
at positions n, m, respectively. Similarly, $B_{n,m}$ plays the same role for
station B, under the same settings. This is equivalent to the assumption that a
joint (non-negative, properly normalized) probability distribution of these
variables, $P(A_{1,1}, A_{1,2}, A_{2,1}, A_{2,2}; B_{1,1}, B_{1,2}, B_{2,1}, B_{2,2})$,
is always allowed to exist (see footnote 1).

2. The assumption of locality does not allow influences to go outside the light
cone.

3. Alice and Bob are free to choose their settings "at a whim". This is freedom,
or "free will", often only a tacit assumption [8]. A less provocative version of
this assumption: there exist stochastic processes which could be used to choose
the values of the local settings of the devices, and which are independent of
the workings of the source, that is, they neither influence it nor are
influenced by it. By the previous assumptions, the events of activation of the
source and of the choice and fixing of the local settings must be space-like
separated.

Footnote 1: Note that no hidden variables appear, beyond these eight. However,
given a (possibly stochastic) hidden-variable theory, one will be able to define
our eight variables as (possibly random) functions of the variables in that
theory.

Note that when setting labels m, n are sent to the measurement devices, they will 
likely cause some unintended disturbance: by these assumptions any disturbance at 
A, as far as it influences the outcome at A, is not related to the coin toss nor to the 
potential outcomes at B, and vice versa. 

Note further that $A_{n,m}$ and $B_{n,m}$ are not necessarily actual properties
of the systems. The only thing that is assumed is that there is a theoretical
description which allows one to use all eight of these values.

4.1.2 First consequences 

Let us write down the immediate consequences of these assumptions: 

• By locality: for all n, m:

$$A_{n,m} = A_n, \qquad B_{n,m} = B_m. \tag{18}$$

That is, the outcome which would appear at A does not depend on which setting
might be chosen at B, and vice versa. Thus $P(A_{1,1}, \ldots, B_{2,2})$ can be
reduced to $P(A_1, A_2, B_1, B_2)$.

• By freedom:

$$(n,m) \text{ is statistically independent of } (A_1, A_2, B_1, B_2). \tag{19}$$

Thus, the overall probability distribution for potential settings and potential
outcomes satisfies

$$P(n, m, A_1, A_2, B_1, B_2) = P(n,m)\, P(A_1, A_2, B_1, B_2). \tag{20}$$

The choice of settings in the two randomizers, A and B, is causally separated
from the local realistic mechanism which produces the potential outcomes.

4.1.3 Lemma: Bell's inequality 

The probabilities, Pr, of the four logical propositions, $A_n = B_m$, satisfy

$$\Pr\{A_1 = B_2\} - \Pr\{A_1 = B_1\} - \Pr\{A_2 = B_1\} - \Pr\{A_2 = B_2\} \le 0. \tag{21}$$

Proof: the product of the four quantities $A_n B_m$ equals
$A_1^2 A_2^2 B_1^2 B_2^2 = 1$, so an even number of them is equal to -1. Hence
only four, or two, or none of the propositions on the left-hand side of the
inequality can be true, thus (21). QED.
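The counting argument in the proof can also be checked by brute force. The sketch below (plain Python; the function names `lemma_lhs` and `true_count` are hypothetical) enumerates all 16 deterministic assignments of ±1 to the four values and evaluates the left-hand side of (21) with truth values in place of probabilities. Since any joint distribution of the four values is a mixture of these assignments, the bound for the probabilities follows by averaging.

```python
from itertools import product

def lemma_lhs(a1, a2, b1, b2):
    """Left-hand side of (21) for one deterministic assignment (truth values 0 or 1)."""
    return int(a1 == b2) - int(a1 == b1) - int(a2 == b1) - int(a2 == b2)

def true_count(a1, a2, b1, b2):
    """Number of the four propositions A_n = B_m that hold for this assignment."""
    return sum(int(x == y) for x in (a1, a2) for y in (b1, b2))

# All 16 assignments of +1/-1 to (A1, A2, B1, B2).
assignments = list(product((+1, -1), repeat=4))
lhs_values = sorted({lemma_lhs(*v) for v in assignments})
counts = sorted({true_count(*v) for v in assignments})

print("values of the left-hand side:", lhs_values)
print("possible numbers of true propositions:", counts)
```

The left-hand side takes only the values -2 and 0, and the number of true propositions is always 0, 2 or 4, in agreement with the proof.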

Now, if the observation settings are totally random (dictated by "coin tosses"),
$P(n,m) = \frac{1}{4}$. Then, according to all our assumptions,

$$P(A_n = B_m | n, m) = P(n,m) \Pr\{A_n = B_m\} = \frac{1}{4} \Pr\{A_n = B_m\}. \tag{22}$$

Therefore, we have a Bell inequality: under the conjunction of the assumptions,
for the experimentally accessible probabilities one has

$$P(A_1 = B_2 | 1,2) - P(A_1 = B_1 | 1,1) - P(A_2 = B_1 | 2,1) - P(A_2 = B_2 | 2,2) \le 0. \tag{23}$$

This is the well-known Clauser-Horne-Shimony-Holt (CHSH) inequality [9].



4.2 The Bell theorem 

Quantum mechanics predicts, for some experiments satisfying all the features of
the thought experiment, the left-hand side of inequality (23) to be as high as
$\sqrt{2} - 1$, which is larger than the local realistic bound 0. Hence, one has
Bell's theorem [10]: if quantum mechanics holds, local realism, defined by the
full set of the above assumptions, is untenable. But how does nature behave:
according to local realism or according to quantum mechanics? It seems that we
are approaching the moment in which one could have an as perfect as possible
laboratory realization of the thought experiment (the locality loophole was
closed in [11], [12], the detection loophole in [13], and in a recent experiment
the measurement settings were space-like separated from the photon pair emission
[14]). Hence the local realistic approach to the description of physical
phenomena is close to being shown untenable too.



4.2.1 The assumptions as a communication complexity problem 

Assume that we have two programmers $P_k$, where k = 1, 2, each possessing an
enormously powerful computer. They share certain joint classical information
strings of arbitrary lengths and/or some computer programs. All these will be
collectively denoted as $\lambda$. But, once they both possess $\lambda$, no
communication whatsoever between them is allowed. After this initial stage, each
one of them gets from a Referee a one-bit random number $x_k \in \{0, 1\}$,
known only to him/her ($P_1$ knows only $x_1$, $P_2$ knows only $x_2$). The
individual task of each of them is to produce, via whatever computational
program, a one-bit number $I_k(x_k, \lambda)$, and communicate only this one bit
to the Referee, who just compares the received bits. There is no restriction on
the form and complexity of the possibly stochastic functions $I_k$, or on any
actions taken to define the values, but any communication between the partners
is absolutely not allowed. The joint task of the partners is to devise a
computer code which, under the constraints listed above and without any
cheating, yields, after very many repetitions of the procedure (each starting
with establishing a new shared $\lambda$), the following functional dependence
of the probability that their bits sent back to the Referee are equal:

$$P\{I_1(x_1) = I_2(x_2)\} = \frac{1}{2} - \frac{1}{2} \cos\left[ -\frac{\pi}{4} + \frac{\pi}{2}(x_1 + x_2) \right]. \tag{24}$$

This is a variant of communication complexity problems. The current task is
absolutely impossible to achieve with the classical means at their disposal, and
without communication, simply because, whatever the protocol,

$$\Pr\{I_1(1) = I_2(1)\} - \Pr\{I_1(0) = I_2(0)\} - \Pr\{I_1(1) = I_2(0)\} - \Pr\{I_1(0) = I_2(1)\} \le 0, \tag{25}$$

whereas the value of this expression in a quantum strategy, $P_Q$, can be as
high as $\sqrt{2} - 1$. If the programmers use entanglement as a resource and
receive their respective qubits from an entangled pair (e.g. a singlet) during
the communication stage (when $\lambda$ is established), they can obtain on
average $P_Q$. Instead of computing, the partners make local measurements on
their qubits. They measure Pauli observables $\mathbf{n}\cdot\boldsymbol{\sigma}$,
where $\|\mathbf{n}\| = 1$. Since the probability for them to get identical
results, $r_1, r_2$, for observation directions $\mathbf{n}_1, \mathbf{n}_2$ is

$$P_Q\{r_1 = r_2 | \mathbf{n}_1, \mathbf{n}_2\} = \frac{1}{2} - \frac{1}{2}\, \mathbf{n}_1 \cdot \mathbf{n}_2 , \tag{26}$$

for suitably chosen $\mathbf{n}_1(x_1), \mathbf{n}_2(x_2)$ they get values of
$P_Q$ equal to those in (24). The messages sent back to the Referee encode the
local results of measurements of
$\mathbf{n}_1\cdot\boldsymbol{\sigma} \otimes \mathbf{n}_2\cdot\boldsymbol{\sigma}$,
and the local measurement directions are suitably chosen as functions of $x_1$
and $x_2$. We will come back to the relation between Bell's inequalities and
quantum communication complexity problems in more detail in Sec. 6.
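A minimal numerical sketch of such a quantum strategy is given below, assuming a shared singlet and measurement directions restricted to the x-z plane. The specific angles are one possible choice made for this illustration and are not given in the text; the probabilities are taken from Eq. (26), and the function names (`direction`, `p_equal`, `expression_25`) are hypothetical.

```python
import math

def direction(angle):
    """Unit vector in the x-z plane at the given angle from the z axis."""
    return (math.sin(angle), 0.0, math.cos(angle))

def p_equal(n1, n2):
    """Eq. (26): singlet probability that the two local results coincide."""
    return 0.5 - 0.5 * sum(x * y for x, y in zip(n1, n2))

# Assumed (illustrative) measurement directions as functions of the input bits.
N1 = {0: direction(0.0), 1: direction(math.pi / 2)}
N2 = {0: direction(math.pi / 4), 1: direction(-math.pi / 4)}

def expression_25(prob):
    """Left-hand side of (25) for a function prob(x1, x2) = Pr{I1(x1) = I2(x2)}."""
    return prob(1, 1) - prob(0, 0) - prob(1, 0) - prob(0, 1)

quantum = expression_25(lambda x1, x2: p_equal(N1[x1], N2[x2]))
print("quantum value of expression (25):", quantum)
print("sqrt(2) - 1:                     ", math.sqrt(2) - 1)
```

With these directions the scalar product of the two measurement directions is -cos(pi/4) for the input pair (1, 1) and +cos(pi/4) for the other three pairs, which is what pushes the expression above the classical bound 0.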

4.2.2 Philosophy or physics? Which assumptions? 

The assumptions behind Bell inequalities are often criticized as being
"philosophical". If one recalls Mach's influence on Einstein, it is clear that
philosophical discussions related to physics may be very fruitful.

For those who are, however, still skeptical one can argue as follows. The whole
(relativistic) classical theory of physics is realistic (and local). Thus we
have an important exemplary realization of the postulates of local realism.
Philosophical propositions could be defined as those which are not
observationally or experimentally falsifiable at the given moment of the
development of human knowledge, or which in pure mathematical theory are not
logically derivable. Therefore, the conjunction of all assumptions of Bell
inequalities is not a philosophical statement, as it is testable both
experimentally and logically (within the currently known mathematical
formulation of the fundamental laws of physics). Thus, Bell's theorem removed
the question of the possibility of a local realistic description from the realm
of philosophy. Now this is just a question of a good experiment.

The other criticism is formulated in the following way. Bell inequalities can be
derived using a single assumption of existence of a joint probability
distribution for the observables involved in them, or that the probability
calculus of the experimental propositions involved in the inequalities is of
Kolmogorovian nature, and nothing more. But if we want to apply these
assumptions to the thought experiment we stumble on the following question: does
the joint probability take into account the full experimental context or not?
The experimental context is in our case (at least) the full state of the
settings (n, m). Thus if we use the same notation as above for the realistic
values, this time applied to the possible results of measurements of
observables, initially we can assume existence of only
$P(A_{1,1}, A_{1,2}, A_{2,1}, A_{2,2}; B_{1,1}, B_{1,2}, B_{2,1}, B_{2,2})$.
Note that such a probability could be e.g. factorizable into
$\prod_{n,m} P(A_{n,m}, B_{n,m})$. That is, one could in such a case have
different probability distributions pertaining to different experimental
contexts (which can even be defined through the choice of measurement settings
in space-like separated laboratories!).

Let us discuss this from the quantum mechanical point of view, only because such
considerations have a nice formal description within this theory, familiar to
all physicists. Two observables, say $\hat{A}_1 \otimes \hat{B}_1$ and
$\hat{A}_2 \otimes \hat{B}_2$, as well as other possible pairs, are functions of
two different maximal observables for the whole system (which are non-degenerate
by definition). If one denotes such a maximal observable linked with
$\hat{A}_n \otimes \hat{B}_m$ by $\hat{M}_{n,m}$ and its eigenvalues by
$M_{n,m}$, the existence of the aforementioned joint probability is equivalent
to the existence of a $P(M_{1,1}, M_{1,2}, M_{2,1}, M_{2,2})$ in the form of a
proper probability distribution. Only if one assumes additionally context
independence, this can be reduced to the question of existence of (non-negative)
probabilities $P(A_1, A_2, B_1, B_2)$, where $A_n$ and $B_m$ are eigenvalues of
$\hat{A}_n \otimes 1$ and $1 \otimes \hat{B}_m$, where in turn 1 is the unit
operator for the given subsystem. While context independence is physically
doubtful when the measurements are not spatially separated, and thus one can
have mutual causal dependence, it is well justified for spatially separated
measurements. That is, locality enters our reasoning, whether we like it or not.
Of course one cannot derive any Bell inequality of the usual type if the random
choice of settings is not independent of the distribution of
$A_1, A_2, B_1, B_2$, that is, without (20).

There is yet another challenge to the set of assumptions presented above. It is
often claimed that realism can be derived, once one considers the fact that maximally
entangled quantum systems reveal perfect correlations, and one additionally
assumes locality. Therefore it would seem that the only basic assumption behind
Bell inequalities is locality, with the other auxiliary one of freedom. Such a claim
is based on the ideas of EPR, who conjectured that one can introduce "elements of
reality" of a remote system, provided this system is perfectly correlated with another
system. To show the fallacy of such a hope, let us now discuss three particle correlations,
in the case of which consideration of just a few "elements of reality" reveals that
they are a logically inconsistent notion. Therefore, they cannot be a starting point
for deriving a self-consistent realistic theory. The three particle reasoning is used
here because of its beauty and simplicity, not because one cannot reach a similar
conclusion for two particle correlations.



4.3 Bell's theorem without inequalities: three entangled particles 
or more 

As the simplest example, take a Greenberger-Horne-Zeilinger [13] (GHZ) state of
$N = 3$ particles (fig. 2):

$$|\mathrm{GHZ}\rangle = \frac{1}{\sqrt{2}} \left( |a\rangle|b\rangle|c\rangle + |a'\rangle|b'\rangle|c'\rangle \right) \tag{27}$$

where $\langle x|x'\rangle = 0$ ($x = a, b, c$, and kets denoted by one letter pertain to one of the
particles). The observers, Alice, Bob and Cecil, measure the observables $\hat{A}(\phi_A)$,
$\hat{B}(\phi_B)$, $\hat{C}(\phi_C)$, defined by

$$\hat{X}(\phi_X) = |+,\phi_X\rangle\langle +,\phi_X| - |-,\phi_X\rangle\langle -,\phi_X| \tag{28}$$

where

$$|\pm,\phi_X\rangle = \frac{1}{\sqrt{2}} \left( \pm i|x'\rangle + \exp(i\phi_X)|x\rangle \right) \tag{29}$$

and $X = A, B, C$. The quantum prediction for the expectation value of the product of
the three local observables is given by

$$E(\phi_A,\phi_B,\phi_C) = \langle \mathrm{GHZ}|\hat{A}(\phi_A)\hat{B}(\phi_B)\hat{C}(\phi_C)|\mathrm{GHZ}\rangle = \sin(\phi_A + \phi_B + \phi_C). \tag{30}$$

Therefore, if $\phi_A + \phi_B + \phi_C = \pi/2 + k\pi$, quantum mechanics predicts perfect correlations.
For example, for $\phi_A = \pi/2$, $\phi_B = 0$ and $\phi_C = 0$, whatever may be the results
of local measurements of the observables, for say the particles belonging to the $i$-th
triple represented by the quantum state $|\mathrm{GHZ}\rangle$, their product must be unity. In a
local realistic theory one would have

$$A^i(\pi/2)\,B^i(0)\,C^i(0) = 1, \tag{31}$$

where $X^i(\phi)$, $X = A, B$ or $C$, is the local realistic value of a local measurement of
the observable $\hat{X}(\phi)$ that would have been obtained for the $i$-th particle triple if
the setting of the measuring device is $\phi$. By locality $X^i(\phi)$ depends solely on the
local parameter. Eq. (31) indicates that we can predict with certainty the result
of measuring the observable pertaining to one of the particles (say c) by choosing
to measure suitable observables for the other two. Hence the values $X^i(\phi)$ are EPR
elements of reality.

However, if the local apparatus settings are different one would have had, e.g.

$$A^i(0)\,B^i(0)\,C^i(\pi/2) = 1, \tag{32}$$
$$A^i(0)\,B^i(\pi/2)\,C^i(0) = 1, \tag{33}$$
$$A^i(\pi/2)\,B^i(\pi/2)\,C^i(\pi/2) = -1. \tag{34}$$

Fig. 2 Test of the GHZ theorem. Alice, Bob and Cecil are three separated parties who
share three entangled particles in the GHZ state. Each of them is free to choose
between two measurement settings 1 and 2 and they observe flashes in their detection
station which indicate one of the two possible measurement outcomes +1 or -1.

Yet, the four statements (31)-(34) are inconsistent within local realism. Since $X^i(\phi) = \pm 1$,
if one multiplies side by side Eqs. (31)-(34), every local value appears twice on the
left-hand side, so the left-hand sides multiply to 1, and the result is

$$1 = -1. \tag{35}$$

This shows that the mere concept of existence of "elements of physical reality" as
introduced by EPR is in contradiction with quantum mechanical predictions. We
have a "Bell's theorem without inequalities" [13].

Some people still claim that EPR correlations together with the assumption of
locality allow one to derive realism. The above example clearly shows that such a
realism would allow one to infer that $1 = -1$.
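
The inconsistency of Eqs. (31)-(34) can also be confirmed by exhaustive search. The
following Python sketch is only an illustration (the function name is ours, not part of
the original argument): it tries all $2^6$ assignments of predetermined values $\pm 1$ to
the two settings $0$ and $\pi/2$ of each party and collects those compatible with the four
perfect correlations.

```python
from itertools import product

def consistent_assignments():
    """Return all local realistic assignments satisfying Eqs. (31)-(34)."""
    solutions = []
    # a0, b0, c0: predetermined values for setting 0;
    # a1, b1, c1: predetermined values for setting pi/2.
    for a0, a1, b0, b1, c0, c1 in product((-1, 1), repeat=6):
        eq31 = a1 * b0 * c0 == 1    # A(pi/2) B(0)    C(0)    = 1
        eq32 = a0 * b0 * c1 == 1    # A(0)    B(0)    C(pi/2) = 1
        eq33 = a0 * b1 * c0 == 1    # A(0)    B(pi/2) C(0)    = 1
        eq34 = a1 * b1 * c1 == -1   # A(pi/2) B(pi/2) C(pi/2) = -1
        if eq31 and eq32 and eq33 and eq34:
            solutions.append((a0, a1, b0, b1, c0, c1))
    return solutions

print(len(consistent_assignments()))  # prints 0
```

No assignment survives, in agreement with Eq. (35).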



4.4 Implications of Bell's theorem 

Violations of Bell's inequalities imply that the underlying conjunction of assump- 
tions of realism, locality and "free will" is not valid, and nothing more. 

It is often said that the violations indicate "(quantum) non-locality". However if 
one wants non-locality to be the implication, one has to assume "free will" and real- 
ism. But this is only at this moment a philosophical choice (it seems that there is no 
way to falsify it). It is not a necessary condition for violations of Bell's inequalities. 

The theorem of Bell shows that even a local inherently probabilistic hidden-variable
theory cannot agree with all predictions of quantum theory (we base our
considerations on $p(A_1, A_2, B_1, B_2)$ without assuming its actual structure, or whether
the distribution for a single run is essentially deterministic; all we require is a joint
"co-existence" of the variables $A_1, \ldots, B_2$ in a theoretical description). Therefore the
above statements cover theories that treat probabilities as irreducible, and for which
one can define $p(A_1, A_2, B_1, B_2)$. Such theories contradict quantum predictions. This,
for some authors, indicates that nature is non-local. While the mere existence of
Bohm's model [16] demonstrates that non-local hidden variables are a logically
valid option, we now know that there are plausible models, such as Leggett's
crypto-nonlocal hidden-variable model [17], that are in disagreement with both quantum
predictions and experiment [18]. But, perhaps more importantly, if one is ready to
consider inherently probabilistic theories, then there is no immediate reason to require
the existence of (non-negative and normalized) probabilities $p(A_{1,1}, \ldots, B_{2,2})$.
Violation of this condition on realism, together with locality, which allows one to
reduce the distribution to $p(A_1, \ldots, B_2)$, is not in direct conflict with the theory
of relativity, as it does not necessarily imply the possibility of superluminal signalling.
On the contrary, quantum correlations cannot be used for direct communication
between Alice and Bob, but still violate Bell's inequalities. It is therefore
legitimate to consider quantum theory as a probability theory subject to, or even
derivable from, more general principles, such as the non-signaling condition [19, 20] or
information-theoretical principles [21, 22].

Note that complementarity, inherent in the quantum formalism$^2$, completely contradicts
the form of realism defined above. So why quantum non-locality?

In short, Bell's theorem does not imply any property of quantum mechanics.
It just tells what it is not.



5 All Bell's inequalities for two possible settings on each side 

We shall now present a general method of deriving all standard Bell inequalities
(that is, Bell's inequalities involving two-outcome measurements and with two settings
per observer). Although these will not be spelled out explicitly, all the assumptions
discussed above are behind the algebraic manipulations leading to the
inequalities. We present in detail a derivation for the two-observer problem, because
the generalization to more observers is, surprisingly, obvious.

Consider pairs of particles (say, photons) simultaneously emitted in well defined
opposite directions. After some time the photons arrive at two very distant measuring
devices A and B operated by Alice and Bob. Alice chooses to measure either
observable $\hat{A}_1$ or $\hat{A}_2$, and Bob either $\hat{B}_1$ or $\hat{B}_2$. The hypothetical results that they may
get for the $j$-th pair of photons are $A_1^j$ and $A_2^j$, for Alice's two possible choices, and
$B_1^j$ and $B_2^j$, for Bob's. The numerical values of these results (+1 or -1) are defined
by the two eigenvalues of the observables.



2 Which can be mathematically expressed as non-existence of joint probabilities for
non-commuting, i.e. non-commeasurable, observables.

Since always either $|B_1^j - B_2^j| = 2$ and $|B_1^j + B_2^j| = 0$, or $|B_1^j - B_2^j| = 0$ and
$|B_1^j + B_2^j| = 2$, with a similar property of Alice's hypothetical results, the following relation
holds

$$|A_1^j \pm A_2^j| \cdot |B_1^j \pm B_2^j| = 0 \tag{36}$$

for all possible sign choices within (36) except one, for which one has 4. Therefore

$$\sum_{k,l=0}^{1} \left| (A_1^j + (-1)^k A_2^j)(B_1^j + (-1)^l B_2^j) \right| = 4, \tag{37}$$

or equivalently one has the set of identities

$$\sum_{s_1,s_2 \in \{-1,1\}} S(s_1,s_2) \left[ (A_1^j + s_1 A_2^j)(B_1^j + s_2 B_2^j) \right] = \pm 4, \tag{38}$$

with any $S(s_1,s_2) = \pm 1$. There are $2^{2^2} = 16$ such $S$ functions.
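
Since only finitely many cases are involved, identities (37) and (38) can be verified by
direct enumeration. The sketch below is a minimal illustration in Python (the helper
name is ours): it runs over all 16 combinations of the hypothetical local values and over
all 16 sign functions $S$.

```python
from itertools import product

def check_identities():
    """Brute-force check of identities (37) and (38); returns the number of S functions."""
    signs = (-1, 1)
    pairs = list(product(signs, repeat=2))            # all (s1, s2)
    # A sign function S assigns +1 or -1 to each of the four pairs (s1, s2).
    sign_functions = [dict(zip(pairs, values)) for values in product(signs, repeat=4)]

    for a1, a2, b1, b2 in product(signs, repeat=4):
        # Identity (37): exactly one of the four terms is nonzero, and it equals 4.
        terms = [abs((a1 + (-1) ** k * a2) * (b1 + (-1) ** l * b2))
                 for k in (0, 1) for l in (0, 1)]
        assert sorted(terms) == [0, 0, 0, 4]
        # Identity (38): every sign function gives either +4 or -4.
        for S in sign_functions:
            total = sum(S[(s1, s2)] * (a1 + s1 * a2) * (b1 + s2 * b2)
                        for s1, s2 in pairs)
            assert total in (-4, 4)
    return len(sign_functions)

print(check_identities())  # prints 16
```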

Imagine now that $N$ pairs of photons are emitted, pair by pair ($N$ is sufficiently
large, such that $\sqrt{1/N} \ll 1$). The average value of the products of the local values
is given by

$$E(A_n, B_m) = \frac{1}{N} \sum_{j=1}^{N} A_n^j B_m^j, \tag{39}$$

where $n, m = 1, 2$.

Therefore after averaging, the following single Bell-type inequality emerges:

$$\sum_{k,l=0}^{1} \left| E(A_1,B_1) + (-1)^l E(A_1,B_2) + (-1)^k E(A_2,B_1) + (-1)^{k+l} E(A_2,B_2) \right| \le 4, \tag{40}$$

or equivalently a series of inequalities:

$$\sum_{s_1,s_2 \in \{-1,1\}} S(s_1,s_2) \left[ E(A_1,B_1) + s_2 E(A_1,B_2) + s_1 E(A_2,B_1) + s_1 s_2 E(A_2,B_2) \right] \le 4. \tag{41}$$
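
To see inequality (40) at work, one can evaluate its left-hand side for a simple local
realistic model and for a quantum prediction. The Python sketch below is illustrative
only: the function names are ours, and the singlet correlation $E = -\cos(\alpha - \beta)$ for
coplanar settings is an assumption brought in from standard quantum mechanics, not
something derived in this section.

```python
import math

def bell_sum(E):
    """Left-hand side of inequality (40); E[n][m] stands for E(A_{n+1}, B_{m+1})."""
    return sum(abs(E[0][0] + (-1) ** l * E[0][1]
                   + (-1) ** k * E[1][0] + (-1) ** (k + l) * E[1][1])
               for k in (0, 1) for l in (0, 1))

def singlet_correlations(alice_angles, bob_angles):
    """Assumed quantum prediction E = -cos(alpha - beta) for coplanar settings."""
    return [[-math.cos(a - b) for b in bob_angles] for a in alice_angles]

# Deterministic local realistic model: A_1 = A_2 = 1, B_1 = 1, B_2 = -1 in every run.
lr = [[a * b for b in (1, -1)] for a in (1, 1)]
# Settings commonly used to maximize the violation (an assumed choice).
qm = singlet_correlations((0.0, math.pi / 2), (math.pi / 4, -math.pi / 4))

print(bell_sum(lr))   # local realistic value, saturating the bound 4
print(bell_sum(qm))   # quantum value, 4*sqrt(2), which exceeds 4
```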

As the choice of measurement settings is assumed to be statistically independent of
the working of the source, i.e. of the distribution of $A_1$'s, $A_2$'s, $B_1$'s and $B_2$'s, the
averages $E(A_n, B_m)$ cannot differ much, for high $N$, from the actually observed ones
in the subsets of runs for which the given pair of settings was selected.



5.1 Completeness of the inequalities

The inequalities form a complete set. That is, they define the faces of the convex
polytope formed out of all possible local realistic models for the given set of measurements.
Whenever a local realistic model exists, inequality (40) is satisfied by its
predictions. To prove the sufficiency of condition (40) we construct a local realistic
model for any correlation functions which satisfy it, i.e. we are interested in the local
realistic models for $E^{LR}_{k_1 k_2}$ such that they fully agree with the measured correlations
$E(k_1, k_2)$ for all possible observables $k_1, k_2 = 1, 2$.

One can introduce $\hat{E}$, which is a "tensor" or matrix built out of $E_{ij}$, with $i,j = 1,2$.
If all its components can be derived from local realism, one must have

$$\hat{E}_{LR} = \sum_{\mathbf{A},\mathbf{B}=-1}^{1} P(\mathbf{A},\mathbf{B})\, \mathbf{A} \otimes \mathbf{B}, \tag{42}$$

with $\mathbf{A} = (A_1(\mathbf{n}_1), A_1(\mathbf{n}_2))$, $\mathbf{B} = (A_2(\mathbf{n}_1), A_2(\mathbf{n}_2))$, where
$A_j(\mathbf{n}_1) = s_j A_j(\mathbf{n}_2)$ with $s_1, s_2 \in \{-1,1\}$, and
nonnegative normalized probabilities $P(\mathbf{A},\mathbf{B})$.

Let us ascribe, for fixed $s_1, s_2$, a hidden probability that $A_j(\mathbf{n}_1) = s_j A_j(\mathbf{n}_2)$ (with
$j = 1, 2$) in the form familiar from Eq. (40):

$$P(s_1,s_2) = \frac{1}{4} \left| \sum_{k_1,k_2=1}^{2} s_1^{k_1-1} s_2^{k_2-1} E_{k_1 k_2} \right|. \tag{43}$$

Obviously these probabilities are non-negative. However they sum up to unity only
if inequality (40) is saturated; otherwise there is a "probability deficit", $\Delta P$. This
deficit can be compensated without affecting correlation functions.

First we construct the following structure, which is indeed the local realistic
model of the set of correlation functions if the inequality is saturated:

$$\sum_{s_1,s_2=-1}^{1} \Sigma(s_1,s_2)\, P(s_1,s_2)\, (1,s_1) \otimes (1,s_2), \tag{44}$$

where $\Sigma(s_1,s_2)$ is the sign of the expression within the modulus in Eq. (43).
Now if $\Delta P > 0$, we add a "tail" to this expression given by:

$$\frac{\Delta P}{16} \sum_{A_1=-1}^{1} \sum_{A_2=-1}^{1} \sum_{B_1=-1}^{1} \sum_{B_2=-1}^{1} (A_1,A_2) \otimes (B_1,B_2). \tag{45}$$

This "tail" does not contribute to the values of the correlation functions, because it
represents fully random noise. The sum of (44) and (45) is a valid local realistic model
for $\hat{E} = (E(1,1), E(1,2), E(2,1), E(2,2))$. The sole role of the "tail" is to make all
hidden probabilities add up to 1.
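
The construction (43)-(45) is explicit enough to be turned into a short program. The
following Python sketch is one possible reading of it, not a definitive implementation:
the function names and the dictionary representation of the hidden probabilities are
our own choices, and each term of (44) is realized by the deterministic values
$\mathbf{A} = (1, s_1)$, $\mathbf{B} = \Sigma(s_1,s_2)(1, s_2)$.

```python
from itertools import product

SIGNS = (-1, 1)

def local_realistic_model(E):
    """Hidden probabilities for correlations E[k1][k2] obeying inequality (40).

    Returns a dict mapping (A1, A2, B1, B2) to its probability, following (43)-(45).
    """
    model = {values: 0.0 for values in product(SIGNS, repeat=4)}
    total = 0.0
    for s1, s2 in product(SIGNS, repeat=2):
        # Expression inside the modulus of Eq. (43), with indices shifted to 0 and 1.
        inner = sum(s1 ** k1 * s2 ** k2 * E[k1][k2]
                    for k1 in (0, 1) for k2 in (0, 1))
        prob = abs(inner) / 4.0
        sign = 1 if inner >= 0 else -1
        # Term of Eq. (44): sign * (1, s1) (x) (1, s2).
        model[(1, s1, sign, sign * s2)] += prob
        total += prob
    deficit = 1.0 - total
    if deficit < -1e-12:
        raise ValueError("correlations violate inequality (40)")
    # Eq. (45): spread the probability deficit uniformly over all 16 assignments.
    for values in model:
        model[values] += deficit / 16.0
    return model

def correlations(model):
    """Correlation functions E[k1][k2] predicted by the hidden-variable model."""
    return [[sum(p * v[k1] * v[2 + k2] for v, p in model.items())
             for k2 in (0, 1)] for k1 in (0, 1)]

E = [[0.3, 0.1], [-0.2, 0.4]]            # arbitrary example satisfying (40)
model = local_realistic_model(E)
print(sum(model.values()))               # total probability
print(correlations(model))               # should reproduce E up to rounding
```

For correlations that violate (40) the probabilities (43) would add up to more than 1,
which is why the sketch refuses to build a model in that case.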

To give the reader some intuitive grounds for the actual form of, and the completeness
of, the derived inequalities, we shall now give some remarks. The gist is
that the consecutive terms in the inequalities are just expansion coefficients of the
tensor $\hat{E}$ in terms of a complete orthogonal sequence of basis tensors. Thus the expansion
coefficients represent the tensors in a one-to-one way.

In the four dimensional real space where both $\hat{E}_{LR}$ and $\hat{E}$ are defined one can
find an orthonormal basis set $S_{s_1 s_2} = \frac{1}{2}(1,s_1) \otimes (1,s_2)$. Within these definitions the
hidden probabilities acquire a simple form:

$$P(s_1,s_2) = \frac{1}{2} \left| S_{s_1 s_2} \cdot \hat{E} \right|, \tag{46}$$

where the dot denotes the scalar product in $R^4$. Now the local realistic correlations,
$\hat{E}_{LR}$, can be expressed as:

$$\hat{E}_{LR} = \sum_{s_1,s_2=-1}^{1} \left| S_{s_1 s_2} \cdot \hat{E} \right| \Sigma(s_1,s_2)\, S_{s_1 s_2}. \tag{47}$$

The modulus of any number $x$ can be split into $|x| = x\,\mathrm{sign}(x)$, and we can always
demand the product $A_1(\mathbf{n}_1)A_2(\mathbf{n}_1)$ to have the same sign as the expression inside
the modulus. Thus we have:

$$\hat{E} = \sum_{s_1,s_2=-1}^{1} \left( S_{s_1 s_2} \cdot \hat{E} \right) S_{s_1 s_2}. \tag{48}$$

The expression in the bracket is the coefficient of tensor $\hat{E}$ in the basis $S_{s_1 s_2}$. These
coefficients are then summed over the same basis vectors, therefore the last equality
appears, and the local realistic correlations $\hat{E}_{LR}$ coincide with $\hat{E}$.



5.2 Two-qubit states that violate the inequalities 



A general two qubit state can be put in the following concise form

$$\rho = \frac{1}{4} \sum_{\mu,\nu=0}^{3} T_{\mu\nu}\, \sigma_\mu^1 \otimes \sigma_\nu^2. \tag{49}$$

The two qubit correlation function for measurements of spin 1 along direction $\mathbf{n}(1)$
and of spin 2 along $\mathbf{n}(2)$ is given by

$$E_{QM}(\mathbf{n}(1),\mathbf{n}(2)) = \mathrm{Tr}\left[ \rho \left( \mathbf{n}(1)\cdot\boldsymbol{\sigma}^1 \otimes \mathbf{n}(2)\cdot\boldsymbol{\sigma}^2 \right) \right] \tag{50}$$

and it reads

$$E_{QM}(\mathbf{n}(1),\mathbf{n}(2)) = \sum_{i,j=1}^{3} T_{ij}\, n(1)_i\, n(2)_j. \tag{51}$$

Two particle correlations are fully defined once one knows the components $T_{ij}$,
$i,j = 1,2,3$, of the tensor $\hat{T}$. Equation (51) can be put into a more convenient form:

$$E_{QM}(\mathbf{n}(1),\mathbf{n}(2)) = \hat{T} \cdot \mathbf{n}(1) \otimes \mathbf{n}(2), \tag{52}$$

where "$\cdot$" is the scalar product in the space of tensors, which in turn is isomorphic
with $R^3 \otimes R^3$.

The quantum correlation $E_{QM}(\mathbf{n}(1),\mathbf{n}(2))$ can be described by a local realistic model
if, and only if, for any choice of the settings $\mathbf{n}(1)^{k_1}$ and $\mathbf{n}(2)^{k_2}$, where $k_1,k_2 = 1,2$,
one has

$$\frac{1}{4} \sum_{k,l=1}^{2} \left| \hat{T} \cdot \left[ \mathbf{n}(1)^1 + (-1)^k \mathbf{n}(1)^2 \right] \otimes \left[ \mathbf{n}(2)^1 + (-1)^l \mathbf{n}(2)^2 \right] \right| \le 1. \tag{53}$$

Since there always exist two mutually orthogonal unit vectors $\mathbf{a}(x)^1$ and $\mathbf{a}(x)^2$ such
that

$$\mathbf{n}(x)^1 + (-1)^k \mathbf{n}(x)^2 = 2\, a(x)_k\, \mathbf{a}(x)^k \quad \text{with } k = 1,2 \tag{54}$$

and with $a(x)_1 = \cos\theta(x)$, $a(x)_2 = \sin\theta(x)$, one obtains

$$\sum_{k,l=1}^{2} \left| a(1)_k\, a(2)_l\, \hat{T} \cdot \mathbf{a}(1)^k \otimes \mathbf{a}(2)^l \right| \le 1. \tag{55}$$

Note that $\hat{T} \cdot \mathbf{a}(1)^k \otimes \mathbf{a}(2)^l$ is a component of the tensor $\hat{T}$ after a transformation of
the local coordinate systems of each of the particles into such ones where the two
first basis vectors are $\mathbf{a}(x)^1$ and $\mathbf{a}(x)^2$. We shall denote such transformed components
again by $T_{kl}$.

The necessary and sufficient condition for a two-qubit correlation to be described
within a local realistic model is that in any plane of observations for each particle
(defined by the two observation directions) one must have

$$\sum_{k,l=1}^{2} \left| a(1)_k\, a(2)_l\, T_{kl} \right| \le 1 \tag{56}$$

for arbitrary $a(1)_k$, $a(2)_l$.

Using the Cauchy inequality one obtains

$$\sum_{k,l=1}^{2} \left| a(1)_k\, a(2)_l\, T_{kl} \right| \le \sqrt{\sum_{k,l=1}^{2} T_{kl}^2}. \tag{57}$$

Therefore, if

$$\sum_{k,l=1}^{2} T_{kl}^2 \le 1 \tag{58}$$

for any set of local coordinate systems, the two particle correlation functions of
the form of (51) can be understood within local realism (in a two settings per
observer experiment).

This condition is both necessary and sufficient.



5.2.1 Sufficient condition for violation of the inequality 

The full set of inequalities is derivable from the identity (38) where we put the
non-factorable sign function $S(s_1,s_2) = \frac{1}{2}\left[ (1 + s_1) + (1 - s_1) s_2 \right]$. In this case one obtains
the CHSH inequality in its standard form:

$$\left| \left\langle (A_1 + A_2) B_1 + (A_1 - A_2) B_2 \right\rangle_{avg} \right| \le 2, \tag{59}$$

where $\langle \ldots \rangle_{avg}$ denotes the average. All other non-trivial inequalities are obtainable
by all possible sign changes $X_k \to -X_k$ (with $k = 1,2$ and $X = A,B$). It is easy
to see that factorizable sign functions, such as e.g. $S(s_1,s_2) = s_1 s_2$, lead to trivial
inequalities $|E(A_n,B_m)| \le 1$. As noted above the quantum correlation function
$E_Q(\mathbf{a}_k,\mathbf{b}_l)$ is given by the scalar product of the correlation tensor $\hat{T}$ with the tensor
product of the local measurement settings represented by unit vectors $\mathbf{a}_k \otimes \mathbf{b}_l$,
i.e. $E_Q(\mathbf{a}_k,\mathbf{b}_l) = (\mathbf{a}_k \otimes \mathbf{b}_l) \cdot \hat{T}$. Thus, the condition for a quantum state endowed
with the correlation tensor $\hat{T}$ to satisfy the inequality (59) is that for all directions
$\mathbf{a}_1, \mathbf{a}_2, \mathbf{b}_1, \mathbf{b}_2$ one has

$$\left| \left[ \left( \frac{\mathbf{a}_1 + \mathbf{a}_2}{2} \right) \otimes \mathbf{b}_1 + \left( \frac{\mathbf{a}_1 - \mathbf{a}_2}{2} \right) \otimes \mathbf{b}_2 \right] \cdot \hat{T} \right| \le 1, \tag{60}$$

where both sides of (59) were divided by 2.

Next notice that $\mathbf{A}_\pm = \frac{1}{2}(\mathbf{a}_1 \pm \mathbf{a}_2)$ satisfy the following relations: $\mathbf{A}_+ \cdot \mathbf{A}_- = 0$
and $\|\mathbf{A}_+\|^2 + \|\mathbf{A}_-\|^2 = 1$. Thus $\mathbf{A}_+ + \mathbf{A}_-$ is a unit vector, and $\mathbf{A}_\pm$ represent its
decomposition into two orthogonal vectors. If one introduces unit vectors $\mathbf{a}_\pm$ such
that $\mathbf{A}_\pm = \alpha_\pm \mathbf{a}_\pm$, one has $\alpha_+^2 + \alpha_-^2 = 1$. Thus one can put inequality (60) into the
following form:

$$|\hat{S} \cdot \hat{T}| \le 1, \tag{61}$$

where $\hat{S} = \alpha_+ \mathbf{a}_+ \otimes \mathbf{b}_1 + \alpha_- \mathbf{a}_- \otimes \mathbf{b}_2$. Note that since $\mathbf{a}_+ \cdot \mathbf{a}_- = 0$, one has $\hat{S} \cdot \hat{S} = 1$, i.e.
$\hat{S}$ is a tensor of unit norm. Any tensor of unit norm, $\hat{U}$, has the following Schmidt
decomposition $\hat{U} = \lambda_1 \mathbf{v}_1 \otimes \mathbf{w}_1 + \lambda_2 \mathbf{v}_2 \otimes \mathbf{w}_2$, where $\mathbf{v}_i \cdot \mathbf{v}_j = \delta_{ij}$, $\mathbf{w}_i \cdot \mathbf{w}_j = \delta_{ij}$ and
$\lambda_1^2 + \lambda_2^2 = 1$. The (complete) freedom of the choice of the measurement directions
$\mathbf{b}_1$ and $\mathbf{b}_2$ allows one, by choosing $\mathbf{b}_2$ orthogonal to $\mathbf{b}_1$, to put $\hat{S}$ in the form isomorphic
with $\hat{U}$, and the freedom of choice of $\mathbf{a}_1$ and $\mathbf{a}_2$ allows $\mathbf{a}_+$ and $\mathbf{a}_-$ to be arbitrary
orthogonal unit vectors, and $\alpha_+$ and $\alpha_-$ to be also arbitrary. Thus $\hat{S}$ can be equal
to any unit tensor. To get the maximum of the left hand side of (60), we Schmidt
decompose the correlation tensor, and take the two terms of the decomposition which
have the largest coefficients. In this way we get a tensor $\hat{T}'$, of Schmidt rank two.
We put $\hat{S} = \hat{T}'/\|\hat{T}'\|$, and the maximum is $\|\hat{T}'\| = \sqrt{\hat{T}' \cdot \hat{T}'}$. Thus, in other words,

$$\max \left[ \sum_{k,l=1}^{2} T_{kl}^2 \right] \le 1 \tag{62}$$

is the necessary and sufficient condition for the inequality (40) to hold, provided the
maximization is taken over all local coordinate systems of two observers. The condition
is equivalent to the necessary and sufficient condition of the Horodecki family
[23] for violation of the CHSH inequality.
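
As a rough numerical illustration of condition (62), consider correlation tensors that are
already diagonal, $\hat{T} = \mathrm{diag}(t_1, t_2, t_3)$, so that the Schmidt coefficients are simply $|t_i|$
and no diagonalization is needed. The Python sketch below rests on that simplifying
assumption; the function names are ours, and the noisy-singlet tensor $\hat{T} = -v\,\mathbb{1}$ used
as input is a standard example assumed here rather than taken from the text.

```python
import math

def max_planar_norm_squared(t_diagonal):
    """Left-hand side of (62) for a correlation tensor that is already diagonal.

    The maximum over local coordinate systems is the sum of the two largest
    squared Schmidt coefficients, here the two largest t_i**2.
    """
    squares = sorted((t * t for t in t_diagonal), reverse=True)
    return squares[0] + squares[1]

def satisfies_condition_62(t_diagonal):
    """True if (62) holds, i.e. the two-setting inequalities cannot be violated."""
    return max_planar_norm_squared(t_diagonal) <= 1.0 + 1e-12

def max_chsh_value(t_diagonal):
    """Largest value of the left-hand side of (59): twice the maximum of |S.T|."""
    return 2.0 * math.sqrt(max_planar_norm_squared(t_diagonal))

def noisy_singlet_tensor(v):
    """Assumed example: singlet state mixed with white noise, T = -v * identity."""
    return (-v, -v, -v)

for v in (0.5, 0.7, 0.8, 1.0):
    t = noisy_singlet_tensor(v)
    print(v, satisfies_condition_62(t), max_chsh_value(t))
```

Under these assumptions the condition fails as soon as $2v^2 > 1$, i.e. for $v > 1/\sqrt{2}$.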



5.3 Bell's inequalities for N particles 

Let us consider a Bell inequality test with $N$ observers. Each of them chooses between
two possible observables, determined by local parameters $\mathbf{n}_1(j)$ and $\mathbf{n}_2(j)$,
where $j = 1,\ldots,N$. Local realism implies existence of two numbers $A_1^j$ and $A_2^j$, each
taking values +1 or -1, which describe the predetermined result of a measurement
by the $j$-th observer for the two observables. The following algebraic identity holds:

$$\sum_{s_1,\ldots,s_N=-1}^{1} S(s_1,\ldots,s_N) \prod_{j=1}^{N} \left[ A_1^j + s_j A_2^j \right] = \pm 2^N, \tag{63}$$

where $S(s_1,\ldots,s_N)$ is an arbitrary "sign" function, i.e. $S(s_1,\ldots,s_N) = \pm 1$. It is a
straightforward generalization of the one for two observers as given in (38). The
correlation function is the average over many runs of the experiment,
$E_{k_1,\ldots,k_N} = \langle \prod_{j=1}^{N} A^j_{k_j} \rangle_{avg}$ with $k_1,\ldots,k_N \in \{1,2\}$. After averaging (63) over the ensemble of the
runs one obtains the Bell inequalities$^3$

$$\left| \sum_{s_1,\ldots,s_N=-1}^{1} S(s_1,\ldots,s_N) \sum_{k_1,\ldots,k_N=1}^{2} s_1^{k_1-1} \cdots s_N^{k_N-1} E_{k_1,\ldots,k_N} \right| \le 2^N. \tag{64}$$

Since there are $2^{2^N}$ different functions $S$, the above inequality represents a set of $2^{2^N}$
Bell inequalities.

All these boil down to just one inequality (!):

$$\sum_{s_1,\ldots,s_N=-1}^{1} \left| \sum_{k_1,\ldots,k_N=1}^{2} s_1^{k_1-1} \cdots s_N^{k_N-1} E_{k_1,\ldots,k_N} \right| \le 2^N. \tag{65}$$

The proof of this fact is a trivial exercise with the use of the property that either $|X| = X$
or $|X| = -X$, where $X$ is a real number. This inequality was derived independently
in Refs. [24] and [25]. The presented derivation follows mainly Ref. [26].
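
The left-hand side of (65) is easy to evaluate numerically for small $N$. The Python sketch
below is an illustration under stated assumptions: the function names are ours, the
GHZ correlation is taken in the form of Eq. (30), and the choice of local phases $0$
(setting 1) and $\pi/2$ (setting 2) is ours.

```python
import math
from itertools import product

def multiparty_bell_sum(correlation, n):
    """Left-hand side of (65); correlation maps settings (k_1..k_N) in {1,2}^N to E."""
    total = 0.0
    for signs in product((-1, 1), repeat=n):
        inner = 0.0
        for settings in product((1, 2), repeat=n):
            weight = 1
            for s, k in zip(signs, settings):
                weight *= s ** (k - 1)
            inner += weight * correlation(settings)
        total += abs(inner)
    return total

def ghz_correlation(settings):
    """Eq. (30) with phase 0 for setting 1 and pi/2 for setting 2 (assumed choice)."""
    return math.sin(sum((k - 1) * math.pi / 2 for k in settings))

def deterministic_correlation(settings):
    """A local realistic model in which every predetermined result equals +1."""
    return 1.0

n = 3
print(multiparty_bell_sum(deterministic_correlation, n), 2 ** n)  # bound saturated
print(multiparty_bell_sum(ghz_correlation, n), 2 ** n)            # bound exceeded
```

For three particles the GHZ correlations give twice the local realistic bound $2^3 = 8$.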



3 This set of inequalities is a sufficient and necessary condition for the correlation functions
entering them to have a local realistic model. Compare it to the two particle case.

A general N-qubit state can be put in the form

$$\rho = \frac{1}{2^N} \sum_{\mu_1,\ldots,\mu_N=0}^{3} T_{\mu_1 \ldots \mu_N} \bigotimes_{k=1}^{N} \sigma^k_{\mu_k}. \tag{66}$$

Thus, the N qubit correlation function has the following structure

$$E_{QM}(\mathbf{n}(1),\mathbf{n}(2),\ldots,\mathbf{n}(N)) = \hat{T} \cdot \mathbf{n}(1) \otimes \mathbf{n}(2) \ldots \otimes \mathbf{n}(N), \tag{67}$$

where $\hat{T}$ stands for an $N$ index tensor, with components $T_{k_1 \ldots k_N}$, where $k_j = 1,2,3$.
The necessary and sufficient condition for a description of the correlation function
within local realism in the general case reads

$$\sum_{k_1,\ldots,k_N=1}^{2} \left| a(1)_{k_1} a(2)_{k_2} \ldots a(N)_{k_N} T_{k_1 k_2 \ldots k_N} \right| \le 1, \tag{68}$$

for any possible choice of local coordinate systems for individual particles. Again if

$$\sum_{k_1,\ldots,k_N=1}^{2} T^2_{k_1 \ldots k_N} \le 1 \tag{69}$$

for any set of local coordinate systems, the N-qubit correlation function can be described
by a local realistic model. The proofs of these facts are generalizations of the
ones presented earlier pertaining to two particles. The sufficient condition for violation
of the general Bell's inequality for N particles by a general state of N qubits
can be found in Ref. [26].

5.5 Concluding remarks 

The inequalities presented above represent the full set of standard "tight" Bell's
inequalities for an arbitrary number of parties. Any non-tight inequality is weaker
than the tight ones. Such Bell's inequalities can be used to detect entanglement, although
not as efficiently as general entanglement witnesses. However, they have the advantage
over the witnesses that they are system-independent: they detect entanglement no
matter what the actual Hilbert space describing the subsystems is.

As we shall show below, the Bell inequalities analyzed above also show that the
entanglement violating them is directly applicable in some quantum informational
protocols that beat any classical ones of the same kind. This will be shown via an
explicit construction of such protocols.



6 Quantum reduction of communication complexity

In his review paper entitled "Quantum Communication Complexity (A Survey)"
Brassard [28] posed a question: "Can entanglement be used to save on classical
communication?" He continued that there are good reasons to believe at first that
the answer to the question is negative. Holevo's theorem [29] states that no more
than n bits of classical information can be communicated between parties by the
transmission of n qubits regardless of the coding scheme as long as no entanglement
is shared between parties. If the communicating parties share prior entanglement,
twice as much classical information can be transmitted (this is so called "superdense
coding" [30]), but no more. It is thus reasonable to expect that even if the parties
share entanglement no savings in communication can be achieved beyond that of
the superdense coding (2n bits per n qubits transmitted).

It is also well known that entanglement alone cannot be used for communication. 
Local operations performed on any subsystem of an entangled composite system 
cannot have any observable effect on any other subsystem; otherwise it could be 
exploited to communicate faster than light. One would thus intuitively conclude that 
entanglement is useless for saving communication. Brassard, however, concluded 
"... all the intuition in this paragraph is wrong." 

The topic of classical communication complexity was introduced and first studied
by Andrew Yao in 1979 [31]. A typical communication complexity problem can
be formulated as follows. Let Alice and Bob be two separated parties who receive
some input data of which they know only their own data and not the data of the
partner. Alice receives an input string x and Bob an input string y, and the goal is for
both of them to determine the value of a certain function f(x,y). Before they start the
protocol Alice and Bob are even allowed to share (classically correlated) random
strings or any other data, which might improve the success of the protocols. They
are allowed to process their data locally in whatever way. The obvious method to
achieve the goal is for Alice to communicate x to Bob, which allows him to compute
f(x,y). Once obtained, Bob can then communicate the value f(x,y) back to Alice.
It is the topic of communication complexity to address the questions: Could there
be more efficient solutions for some functions f(x,y)? What are these functions?

A trivial example showing that there could be more efficient solutions than the obvious
one given above is a constant function f(x,y) = c, where c is a constant. Obviously
here Alice and Bob do not need to communicate at all, as they can simply take c for
the value of the function. However there are functions for which only the obvious
solution is optimal, that is, only transmission of x to Bob warrants that he reaches the
correct result. For instance, it is shown that n bits of communication are necessary
and sufficient for Bob to decide whether or not Alice's n-bit input is the same as his
own.

Generally one might distinguish the following two types of communication com- 
plexity problems: 

1. What is the minimal amount of communication (minimal number of bits) required
for the parties to determine the value of the function with certainty?
2. What is the highest possible probability for the parties to arrive at the correct
value for the function if only a restricted amount of communication is allowed?

Here we will consider only the second class of problems. Note that in this case one 
does not insist on the correct value of the function to be obtained with certainty. 
While an error in computing the function is allowed, the parties try to compute it 
correctly with as high probability as possible. 

From the perspective of the physics of quantum information processing the natural
question is: Are there communication complexity tasks for which the parties
could increase the success in solving the problem if they share prior entanglement?
In their original paper Cleve and Buhrman [33] showed that entanglement can indeed
be used to save classical communication. They showed that to solve a certain
three-party problem with certainty the parties need to broadcast at least 4 bits of
information in a classical protocol, whereas in the quantum protocol (with entanglement
shared) it is sufficient for them to broadcast only 3 bits of information. This
was the first example of a communication complexity problem that could be solved
with higher success than is possible with any classical protocol. Subsequently,
Buhrman, Cleve and van Dam [34] found a two-party problem that can be solved
with a probability of success exceeding 85% and 2 bits of information communicated
if prior shared entanglement is available, whereas the probability of success in
a classical protocol could not exceed 75% with the same amount of communication.

The first problem whose quantum solution requires a significantly smaller amount
of communication compared to classical solutions was discovered by Buhrman, van
Dam, Høyer and Tapp [35]. They considered a k-party task which requires roughly
k ln k bits of communication in a classical protocol, and exactly k bits of classical
communication if the parties are allowed to share prior entanglement. The quantum
protocol of Ref. [34] is based on the violation of the CHSH inequality by a
two-qubit maximally entangled state. Similarly, the quantum protocols of multi-party
problems [34, 33, 35] are based on an application of the GHZ-type argument
against local realism for multi-qubit maximally entangled states. Galvao [36] has
shown an equivalence between the CHSH and GHZ tests for three particles and the
two- and three-party quantum protocols of Ref. [34], respectively. In a series of papers
[37, 38, 39, 40] it was shown that entanglement violating a Bell inequality can
always be exploited to find a better-than-any-classical solution to some communication
complexity problems. In this brief overview we mainly follow the approach
introduced in these papers. The approach has been further developed and applied in
Ref. (See also Ref. [43]).



6.1 The problem and its optimal classical solution 

Imagine several spatially separated partners, $P_1$ to $P_N$, each of whom has some data
known to him/her only, denoted here as $X_i$, with $i = 1,\ldots,N$. They face a joint task:
to compute the value of a function $T(X_1,\ldots,X_N)$. This function depends on all data.
Obviously they can get the value of $T$ by sending all their data to partner $P_N$, who
does the calculation and announces the result. But are there ways to reduce the
amount of communicated bits, i.e. to reduce the communication complexity of the
problem?

Assume that every partner $P_k$ receives a two bit string $X_k = (z_k, x_k)$ where $z_k, x_k \in
\{0, 1\}$. We shall consider specific task functions which have the following form

$$T(X_1,\ldots,X_N) = f(x_1,\ldots,x_N)\,(-1)^{\sum_{k=1}^{N} z_k},$$

where $f \in \{-1,1\}$ and the sum in the exponent is modulo 2. The partners know also
the probability distribution ("promise") of the bit strings ("inputs"). There are two
constraints on the problem. Firstly, we shall consider only distributions which are
completely random with respect to $z_k$'s, that is a class of the form $p(X_1,\ldots,X_N) =
2^{-N} p'(x_1,\ldots,x_N)$. Secondly, communication between the partners is restricted to
$N - 1$ bits. Assume that we ask the last partner to give his/her answer $A(X_1,\ldots,X_N)$,
equal to $\pm 1$, to the question what is the functional value $T(X_1,\ldots,X_N)$ in each run for
the given set of inputs $X_1,\ldots,X_N$.

For simplicity, we shall introduce now $y_k = (-1)^{z_k}$, $y_k \in \{-1,1\}$. We shall use $y_k$
as a synonym of $z_k$. Since $T$ is proportional to $\prod_k y_k$, the final answer $A$ is completely
random if it does not depend on every $y_k$. Thus, information on $z_k$'s from all $N - 1$
partners must somehow reach $P_N$. Therefore the only communication "trees" which
might lead to a success are those in which each $P_k$ sends only a one-bit message
$m_k \in \{0,1\}$. Again we introduce: $e_k = (-1)^{m_k}$, $e_k \in \{-1,1\}$, and will treat it as a
synonym of $m_k$.

The average success of a communication protocol can be measured with the following
fidelity function

$$F = \sum_{X_1,\ldots,X_N} p(X_1,\ldots,X_N)\, T(X_1,\ldots,X_N)\, A(X_1,\ldots,X_N), \tag{70}$$

or equivalently

$$F = \frac{1}{2^N} \sum_{x_1,\ldots,x_N=0}^{1} \sum_{y_1,\ldots,y_N=-1}^{1} p'(x_1,\ldots,x_N)\, f(x_1,\ldots,x_N) \prod_{k=1}^{N} y_k\, A(x_1,\ldots,x_N; y_1,\ldots,y_N). \tag{71}$$

The probability of success is P = (1 + F)/2. 
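
To make the fidelity (71) concrete, the Python sketch below evaluates it for a small
two-party example. Everything specific in it is hypothetical and chosen only for
illustration: the task $f(x_1,x_2) = (-1)^{x_1 x_2}$, the uniform distribution $p'$, the helper
names, and the restriction to protocols in which $P_1$ sends $e_1 = y_1 c_1(x_1)$ and $P_2$
answers $y_2 c_2(x_2) e_1$ with $c_1, c_2 = \pm 1$.

```python
from itertools import product

def fidelity(f, p_prime, answer, n):
    """Fidelity (71): average of p' * f * prod(y_k) * A over all inputs."""
    total = 0.0
    for xs in product((0, 1), repeat=n):
        for ys in product((-1, 1), repeat=n):
            y_product = 1
            for y in ys:
                y_product *= y
            total += p_prime(xs) * f(xs) * y_product * answer(xs, ys)
    return total / 2 ** n

def f(xs):
    """Hypothetical task: f = (-1)^(x_1 * x_2)."""
    return (-1) ** (xs[0] * xs[1])

def p_prime(xs):
    """Hypothetical promise: uniform distribution over (x_1, x_2)."""
    return 0.25

def make_protocol(c1, c2):
    """P_1 sends e_1 = y_1 * c1[x_1]; P_2 answers y_2 * c2[x_2] * e_1."""
    def answer(xs, ys):
        e1 = ys[0] * c1[xs[0]]
        return ys[1] * c2[xs[1]] * e1
    return answer

# Search over all sign tables c1(x_1), c2(x_2) for the best protocol of this form.
best = max(fidelity(f, p_prime, make_protocol(c1, c2), 2)
           for c1 in product((-1, 1), repeat=2)
           for c2 in product((-1, 1), repeat=2))
print(best, (1 + best) / 2)   # best fidelity and the success probability P
```

Within this restricted family the best fidelity is 1/2, i.e. a success probability of 75%.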

The first steps of a derivation of the reduced form of the fidelity function for 
an optimal classical protocol will now be presented (the reader may reconstruct the 
other steps or consult references 03811391 "). In a classical protocol the answer A of the 
partner Pn can depend on the local input y^, xn, and messages, e (1 , received 
directly from a subset of I partners P^ , Ph : 

A=A(x N ,y N ,e il ,...,e il ). (72) 

Let us fix xyv, and treat A as a function A XN of the remaining / + 1 dichotomic vari- 
ables 

yN,ei v ...,ei r 
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That is, we treat now xn as a fixed index. All such functions can be thought of as 2 /+1 
dimensional vectors, because the values of each such a function form a sequence of 
the length equal to the number of elements in the domain. In the 2 ,+1 dimensional 
space containing such functions one has an orthogonal basis given by 

/ 

Vjh...ji(yN,ei 1 ,...,e il ) =y } N Y[e-*, (73) 

k=\ 

where G {0,1}. Thus, one can expand A(xN,yN,^ii , ■■■> e ii) with respect 

to this basis and the expansion coefficients read 

\[ c_{j j_1 \ldots j_l}(x_N) = \frac{1}{2^{l+1}} \sum_{y_N, e_{i_1},\ldots,e_{i_l}=\pm 1} A(x_N, y_N, e_{i_1},\ldots,e_{i_l})\, V_{j j_1 \ldots j_l}(y_N, e_{i_1},\ldots,e_{i_l}). \tag{74} \]

Since $|A| = |V_{j j_1 \ldots j_l}| = 1$, one has $|c_{j j_1 \ldots j_l}(x_N)| \leq 1$. We put the expansion into the expression for $F$ and obtain



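\[ F = \frac{1}{2^N} \sum_{x_1,\ldots,x_N=0}^{1}\ \sum_{y_1,\ldots,y_N=\pm 1} g(x_1,\ldots,x_N) \prod_{h=1}^{N} y_h \left[ \sum_{j,j_1,\ldots,j_l=0}^{1} c_{j j_1 \ldots j_l}(x_N)\, y_N^{\,j} \prod_{k=1}^{l} e_{i_k}^{\,j_k} \right], \tag{75} \]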

where $g(x_1,\ldots,x_N) = f(x_1,\ldots,x_N)\,p'(x_1,\ldots,x_N)$. Because $\sum_{y_N=\pm 1} y_N y_N^{0} = 0$ and $\sum_{y_k=\pm 1} y_k e_k^{0} = 0$, only the term with all $j, j_1,\ldots,j_l$ equal to unity can give a non-zero contribution to $F$. Thus, $A$ in $F$ can be replaced by

\[ A' = y_N\, c_N(x_N) \prod_{k=1}^{l} e_{i_k}, \tag{76} \]

where $c_N(x_N)$ stands for $c_{11\ldots1}(x_N)$. Next, notice that, for example, $e_{i_1}$ can depend only on the local data $x_{i_1}$, $y_{i_1}$, and the messages obtained by $P_{i_1}$ from a subset of partners: $e_{p_1},\ldots,e_{p_m}$. This set does not contain any of the $e_{i_k}$'s of formula (76) above. In analogy with $A$, the function $e_{i_1}$, for a fixed $x_{i_1}$, can be treated as a vector, and thus can be expanded in terms of orthogonal basis functions (of a similar nature as in eq. (73)), etc. Again, the expansion coefficients satisfy $|c'_{j j_1 \ldots j_m}(x_{i_1})| \leq 1$. If one puts this into $A'$, one obtains a new form of $F$, which after a trivial summation over $y_N$ and $y_{i_1}$ depends on $c_N(x_N)\, c_{i_1}(x_{i_1}) \prod_{k=2}^{l} e_{i_k} \prod_{n=1}^{m} e_{p_n}$, where $c_{i_1}(x_{i_1})$ stands for $c'_{11\ldots1}(x_{i_1})$, and its modulus is again bounded by 1. Note that $y_N$ and $y_{i_1}$ disappear, as $y_k^2 = 1$.

As each message appears in the product only once, we continue this procedure of expanding those messages which depend on earlier messages, until it halts. The final reduced form of the formula for the fidelity of an optimal protocol reads

\[ F = \sum_{x_1,\ldots,x_N=0}^{1} g(x_1,\ldots,x_N) \prod_{n=1}^{N} c_n(x_n), \tag{77} \]

with $|c_n(x_n)| \leq 1$. Since $F$ in eq. (77) is linear in every $c_n(x_n)$, its extrema are at the limiting values $c_n(x_n) = \pm 1$. In other words, a Bell-like inequality $|F| \leq \mathrm{Max}(F) = B(N)$ gives the upper fidelity bound. Note that the above derivation shows that optimal classical protocols include one in which partners $P_1$ to $P_{N-1}$ send to $P_N$ one-bit messages which encode the value of $e_k = y_k c_k(x_k)$, where $k = 1,2,\ldots,N-1$.
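
The extremal property of the reduced fidelity can be checked numerically for small $N$. The following is a minimal sketch and not part of the original derivation: the function name `classical_bound` and the two-party example weights `g_chsh` (a CHSH-type choice, $g = \frac{1}{4}(-1)^{x_1 x_2}$) are illustrative assumptions. It evaluates the reduced fidelity (77) at every assignment $c_n(x_n) = \pm 1$ and returns the largest value, i.e. $B(N)$.

```python
from itertools import product


def classical_bound(g, n):
    """Return B(N): the maximum of sum_x g(x) * prod_k c_k(x_k) over c_k(x_k) = +-1.

    g maps a tuple of n bits (x_1, ..., x_n) to a real weight.
    """
    inputs = list(product((0, 1), repeat=n))
    best = float("-inf")
    # Each partner k has two free signs, c_k(0) and c_k(1), stored at 2k and 2k+1.
    for signs in product((-1, 1), repeat=2 * n):
        total = 0.0
        for x in inputs:
            term = g(x)
            for k, xk in enumerate(x):
                term *= signs[2 * k + xk]
            total += term
        best = max(best, total)
    return best


def g_chsh(x):
    """Illustrative two-party weights (an assumption): g = (1/4) * (-1)^(x1 * x2)."""
    return 0.25 * (-1) ** (x[0] * x[1])


if __name__ == "__main__":
    bound = classical_bound(g_chsh, 2)
    print("B(2) =", bound, "success probability =", (1 + bound) / 2)
```

For these example weights the search returns 0.5, which corresponds to a classical success probability of $P = (1+F)/2 = 3/4$. The exhaustive loop grows as $4^N \cdot 2^N$ and is meant only for small $N$.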



6.2 Quantum solutions 

The inequality for $F$ suggests that some problems may have quantum solutions which surpass any classical ones in their fidelity. One may simply use an entangled state $|\psi\rangle$ of $N$ qubits that violates the inequality, and send one of the qubits to each of the partners. In a protocol run all $N$ partners make measurements on their local qubits, the settings of which are determined by $x_k$. They measure a certain qubit observable $\mathbf{n}_k(x_k)\cdot\boldsymbol{\sigma}$. The measurement results $\gamma_k = \pm 1$ are multiplied by $y_k$, and the partner $P_k$, for $1 \leq k \leq N-1$, sends a one-bit message to $P_N$ encoding the value of $m_k = y_k \gamma_k$. The last partner calculates $y_N \gamma_N \prod_{k=1}^{N-1} m_k$ and announces this as $A$. The average fidelity of such a process is

\[ F = \sum_{x_1,\ldots,x_N=0}^{1} g(x_1,\ldots,x_N)\, \langle\psi| \bigotimes_{k=1}^{N} \left( \mathbf{n}_k(x_k)\cdot\boldsymbol{\sigma}_k \right) |\psi\rangle, \tag{78} \]

and in certain problems can even reach unity.

For some tasks the quantum vs. classical fidelity ratio grows exponentially with $N$. This is the case, for example, for the so-called modulo-4 sum problem. Each partner receives a two-bit input string ($X_k = 0,1,2,3$; $k = 1,\ldots,N$). The promise is that the $X_k$'s are distributed such that $(\sum_{k=1}^{N} X_k) \bmod 2 = 0$. The task is$^4$: $P_N$ must tell whether the sum modulo 4 of all inputs is 0 or 2.

For this problem the classical fidelity bounds decrease exponentially with $N$, that is $B(F) \leq 2^{-K+1}$, where $K = N/2$ for an even and $K = (N+1)/2$ for an odd number of parties. If one uses the $N$-qubit GHZ state $|\mathrm{GHZ}\rangle = \frac{1}{\sqrt{2}}\left( |z+,\ldots,z+\rangle + |z-,\ldots,z-\rangle \right)$, where $|z\pm\rangle$ is the state of spin $\pm 1$ along the $z$-axis, and suitable pairs of local settings, the associated Bell inequality can be violated maximally. Thus, one has a quantum protocol which always gives the correct answer.
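
The unit quantum fidelity for this problem can be reproduced with a small state-vector calculation. The sketch below is illustrative only. It assumes the weights $g = 2^{-N+1}\cos(\frac{\pi}{2}\sum_k x_k)$ from the alternative formulation in footnote 4, and it assumes one particular "suitable pair" of settings, namely $\sigma_x$ for $x_k = 0$ and $\sigma_y$ for $x_k = 1$; the function names are hypothetical. The classical bound is simply evaluated from the formula quoted in the text.

```python
import cmath
import math
from itertools import product


def ghz_state(n):
    """Amplitudes of (|0...0> + |1...1>)/sqrt(2), indexed by the basis-state integer."""
    psi = [0j] * (2 ** n)
    psi[0] = psi[-1] = 1 / math.sqrt(2)
    return psi


def apply_equatorial(psi, n, k, phi):
    """Apply cos(phi)*sigma_x + sin(phi)*sigma_y to qubit k (k = 0 is the leftmost)."""
    out = [0j] * len(psi)
    mask = 1 << (n - 1 - k)
    for idx, amp in enumerate(psi):
        if amp == 0:
            continue
        # The operator maps |0> to e^{+i phi}|1> and |1> to e^{-i phi}|0>.
        phase = cmath.exp(1j * phi) if not idx & mask else cmath.exp(-1j * phi)
        out[idx ^ mask] += phase * amp
    return out


def correlation(n, phis):
    """<GHZ| tensor product of equatorial observables |GHZ> for the angles phis."""
    psi = ghz_state(n)
    rotated = psi
    for k, phi in enumerate(phis):
        rotated = apply_equatorial(rotated, n, k, phi)
    return sum(a.conjugate() * b for a, b in zip(psi, rotated)).real


def g_mod4(x):
    """g(x) = 2^{-(N-1)} cos(pi/2 * sum x); it vanishes when the sum is odd."""
    w = sum(x)
    return 0.0 if w % 2 else (-1) ** (w // 2) / 2 ** (len(x) - 1)


def quantum_fidelity(n):
    """Fidelity (78) for the GHZ state with sigma_x (x=0) and sigma_y (x=1)."""
    return sum(g_mod4(x) * correlation(n, [xk * math.pi / 2 for xk in x])
               for x in product((0, 1), repeat=n))


def bound_quoted_in_text(n):
    """Classical bound 2^{-K+1}, with K = N/2 (N even) or (N+1)/2 (N odd)."""
    k = n // 2 if n % 2 == 0 else (n + 1) // 2
    return 2.0 ** (-k + 1)
```

Calling `quantum_fidelity(n)` for small `n` gives 1 up to rounding, while `bound_quoted_in_text(n)` falls off as stated in the text, which makes the growing quantum-to-classical ratio explicit.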

In all quantum protocols considered here, entanglement that leads to a violation of Bell's inequality is a resource that allows for better-than-classical efficiency of the protocol. Surprisingly, one can also show a version of a quantum protocol without entanglement [36], [39]. The partners exchange a single qubit, $P_k$ to $P_{k+1}$ and so on, and each of them makes a suitable unitary transformation on it (which depends on $z_k$ and $x_k$). The partner $P_N$, who receives the qubit as the last one, additionally performs a dichotomic measurement. The result he/she gets is equal to $T$. For details, including an experimental realization, see Ref. [39]. The obvious conceptual advantage of such a procedure is that the partners exchange a single qubit, from which, due to the Holevo bound [29], one can read out at most one bit of information. In contrast with the protocol involving entanglement, no classical transfer of any information is required, except for the announcement by $P_N$ of his measurement result!

4 It can be formulated in terms of a task function $T = 1 - (\sum_{k=1}^{N} X_k) \bmod 4$. An alternative formulation of the problem reads $f = \cos(\frac{\pi}{2}\sum_{k=1}^{N} x_k)$ with $p' = 2^{-N+1}\,|\cos(\frac{\pi}{2}\sum_{k=1}^{N} x_k)|$.

In summary, if one has a pure entangled state of many qubits (this can be generalized to higher-dimensional systems and Bell's inequalities involving more than two measurement settings per observer), then there exists a Bell inequality which is violated by this state. This inequality has some coefficients $g(x_1,\ldots,x_n)$ in front of the correlation functions, which can always be renormalized in such a way that

\[ \sum_{x_1,\ldots,x_n=0}^{1} |g(x_1,\ldots,x_n)| = 1. \]

The function $g$ can always be interpreted as a product of the dichotomic function $f(x_1,\ldots,x_n) = g(x_1,\ldots,x_n)/|g(x_1,\ldots,x_n)| = \pm 1$ and a probability distribution $p'(x_1,\ldots,x_n) = |g(x_1,\ldots,x_n)|$. Thus we can construct a communication complexity problem that is tailored to a given Bell's inequality, with task function $T = \prod_{k=1}^{n} y_k\, f$. All this can be extended beyond qubits, see Refs. [37], [40].

As was shown, for three or more parties, $N \geq 3$, quantum solutions for certain communication complexity problems can achieve probabilities of success of unity. This is not the case for $N = 2$ and the problem based on the CHSH inequality. The maximum quantum value for the left-hand side of the CHSH inequality (25) is just $\sqrt{2} - 1$. This is much bigger than the Bell bound of 0, but still not the largest possible value for an arbitrary theory that does not follow local realism, which equals 1. Because the maximum possible violation of the inequality is not attainable by quantum mechanics, several questions arise. Is this limit forced by the theory of probability, or by physical laws? We will address this question in the next section, and look at what the consequences of a maximal logically possible violation of the CHSH inequality would be.



6.3 Stronger-than-quantum-correlations 

The Clauser-Horne-Shimony-Holt (CHSH) inequality [9] for local realistic theories gives the upper bound on a certain combination of correlations between two space-like separated experiments. Consider Alice and Bob who independently perform one out of two measurements on their part of the system, such that in total there are four experimental set-ups: $(x,y) = (0,0)$, $(0,1)$, $(1,0)$ or $(1,1)$. For any local hidden variable theory the CHSH inequality must hold. One can put it in the following form:

\[ p(a=b|x=0,y=0) + p(a=b|x=0,y=1) + p(a=b|x=1,y=0) + p(a=-b|x=1,y=1) \leq 3, \tag{79} \]

or equivalently,

\[ \sum_{x,y=0,1} p(a \oplus b = x\cdot y) \leq 3. \tag{80} \]

In the latter form we interpret the dichotomic measurement results as binary values, 0 or 1, and their relations are put as 'modulo 2 sums', denoted here by $\oplus$. One has $0\oplus 0 = 1\oplus 1 = 0$ and $0\oplus 1 = 1$. For example, $p(a=b|x=0,y=0)$ is the probability that Alice's and Bob's outcomes are the same when she chooses setting $x=0$ and he chooses setting $y=0$.
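
The local realistic bound of 3 can be confirmed by direct enumeration. The snippet below is a minimal sketch with hypothetical function names: it goes through all 16 deterministic local assignments $a(x)$, $b(y)$ and counts for how many of the four settings the condition $a \oplus b = x\cdot y$ holds. Since every local hidden variable model is a probabilistic mixture of such assignments, the maximum over them bounds the whole class.

```python
from itertools import product


def chsh_sum(a_of_x, b_of_y):
    """Number of settings (x, y) for which a XOR b equals x AND y."""
    return sum((a_of_x[x] ^ b_of_y[y]) == (x & y)
               for x, y in product((0, 1), repeat=2))


def best_deterministic():
    """Maximum of the sum over all deterministic local assignments a(x), b(y)."""
    return max(chsh_sum(a, b)
               for a in product((0, 1), repeat=2)
               for b in product((0, 1), repeat=2))
```

Here `a_of_x` and `b_of_y` are pairs of bits giving the predetermined outcomes for settings 0 and 1. No assignment satisfies all four conditions at once, because XOR-ing the four conditions yields the contradiction $0 = 1$.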

As discussed in previous sections, quantum mechanical correlations can violate the local realistic bound of inequality (80), and the limit was proven by Cirel'son [44] to be $2+\sqrt{2}$. In Ref. JT9'| Popescu and Rohrlich asked why quantum mechanics allows a violation of the CHSH inequality with a value of $2+\sqrt{2}$, but not more, though the maximal logically possible value is 4. Would a violation with a value larger than $2+\sqrt{2}$ lead to (superluminal) signaling? If this were true, then quantum correlations could be understood as the maximal allowed correlations respecting the non-signaling requirement. This could give us an insight into the origin of quantum correlations, without any use of the Hilbert space formalism.

The non-signaling condition is equivalent to the requirement that the marginals are independent of the partner's choice of setting:

\[ p(a|x,y) \equiv \sum_{b=0,1} p(a,b|x,y) = p(a|x), \tag{81} \]

\[ p(b|x,y) \equiv \sum_{a=0,1} p(a,b|x,y) = p(b|y), \tag{82} \]

where $p(a,b|x,y)$ is the joint probability for outcomes $a$ and $b$ to occur given that $x$ and $y$ are the choices of measurement settings, respectively, and $p(a|x)$ is the probability for outcome $a$ given that $x$ is the choice of measurement setting (and similarly for $p(b|y)$). Popescu and Rohrlich constructed a toy theory in which the correlations reach the maximal algebraic value of 4 for the left-hand side of inequality (79), but are nevertheless not in contradiction with the no-signaling condition. The probabilities in the toy model are given by

\[ \begin{aligned} p(a=0,b=0|x,y) &= p(a=1,b=1|x,y) = \tfrac{1}{2} && \text{if } xy \in \{00,01,10\}, \\ p(a=1,b=0|x,y) &= p(a=0,b=1|x,y) = \tfrac{1}{2} && \text{if } xy = 11. \end{aligned} \tag{83} \]

Indeed one has

\[ \sum_{x,y=0,1} p(a \oplus b = x\cdot y) = 4. \tag{84} \]
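
Both properties of the toy model, no-signaling and the algebraic maximum of 4, can be verified mechanically. The following sketch uses hypothetical function names and encodes the Popescu-Rohrlich probabilities as "1/2 whenever $a \oplus b = x\cdot y$, otherwise 0". Exact fractions are used so that the comparisons involve no rounding.

```python
from fractions import Fraction
from itertools import product

BITS = (0, 1)


def pr_box(a, b, x, y):
    """p(a, b | x, y) of the toy model: 1/2 if a XOR b == x AND y, else 0."""
    return Fraction(1, 2) if (a ^ b) == (x & y) else Fraction(0)


def marginal_alice(a, x, y):
    """Alice's marginal p(a | x, y), obtained by summing over Bob's outcome."""
    return sum(pr_box(a, b, x, y) for b in BITS)


def marginal_bob(b, x, y):
    """Bob's marginal p(b | x, y), obtained by summing over Alice's outcome."""
    return sum(pr_box(a, b, x, y) for a in BITS)


def is_no_signaling():
    """True if each marginal is independent of the remote setting."""
    alice_ok = all(marginal_alice(a, x, 0) == marginal_alice(a, x, 1)
                   for a, x in product(BITS, repeat=2))
    bob_ok = all(marginal_bob(b, 0, y) == marginal_bob(b, 1, y)
                 for b, y in product(BITS, repeat=2))
    return alice_ok and bob_ok


def chsh_value():
    """Sum over the four settings of p(a XOR b = x AND y)."""
    return sum(pr_box(a, b, x, y)
               for a, b, x, y in product(BITS, repeat=4)
               if (a ^ b) == (x & y))
```

All marginals equal 1/2 regardless of either setting, so neither party can learn anything about the other's choice from local statistics, even though the joint statistics saturate the bound.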

Van Dam [43] and, independently, Cleve considered how plausible stronger-than-quantum correlations are from the point of view of communication complexity, which describes how much communication is needed to evaluate a function with distributed inputs. It was shown that the existence of correlations that maximally violate the CHSH inequality would allow one to perform all distributed computations (between two parties) of dichotomic functions with communication constrained to just one bit. If one is ready to believe that nature should not allow for an "easy life" concerning communication problems, this could be a reason why superstrong correlations are indeed not possible.

Instead of superstrong correlations one usually speaks about a "nonlocal box" (NLB) or Popescu-Rohrlich (PR) box, an imaginary device that takes as inputs $x$ at Alice's and $y$ at Bob's side, and outputs $a$ and $b$ at the respective sides, such that $a \oplus b = x\cdot y$. Quantum mechanical measurements on a maximally entangled state allow for a success probability of $p = \cos^2\frac{\pi}{8} = \frac{2+\sqrt{2}}{4} \approx 0.854$ at the game of simulating NLBs. Recently, it was shown that in any "world" in which it is possible to implement an approximation to the NLB that works correctly with probability greater than $\frac{3+\sqrt{6}}{6} \approx 90.8\%$, for all distributed computations of dichotomic functions with a one-bit communication constraint one can find a protocol that always gives the correct values, Ref. [46]. This bound is an improvement over van Dam's, but still has a gap with respect to the bound imposed by quantum mechanics.
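
The quantum success probability quoted above can be reproduced with an explicit two-qubit calculation. The sketch below is illustrative: the state $(|00\rangle + |11\rangle)/\sqrt{2}$, the restriction to real measurement bases in the $x$-$z$ plane, and the particular Bloch angles in `ALICE` and `BOB` are assumptions chosen for this example, not settings taken from the text.

```python
import math
from itertools import product


def basis_vector(theta, outcome):
    """Real qubit state for `outcome` of a measurement at Bloch angle theta (x-z plane)."""
    if outcome == 0:
        return (math.cos(theta / 2), math.sin(theta / 2))
    return (-math.sin(theta / 2), math.cos(theta / 2))


def joint_probability(a, b, alpha, beta):
    """p(a, b) on (|00> + |11>)/sqrt(2) for measurement angles alpha and beta."""
    u, v = basis_vector(alpha, a), basis_vector(beta, b)
    amplitude = (u[0] * v[0] + u[1] * v[1]) / math.sqrt(2)
    return amplitude ** 2


# Assumed settings: Alice uses angles 0 and pi/2, Bob uses +pi/4 and -pi/4.
ALICE = {0: 0.0, 1: math.pi / 2}
BOB = {0: math.pi / 4, 1: -math.pi / 4}


def success_probability():
    """Average probability that a XOR b == x AND y for uniformly random inputs."""
    total = 0.0
    for x, y in product((0, 1), repeat=2):
        total += sum(joint_probability(a, b, ALICE[x], BOB[y])
                     for a, b in product((0, 1), repeat=2)
                     if (a ^ b) == (x & y))
    return total / 4


def trivialization_threshold():
    """The value (3 + sqrt(6))/6 quoted in the text."""
    return (3 + math.sqrt(6)) / 6
```

With these settings every one of the four input pairs is won with the same probability $\cos^2(\pi/8)$, which lies below the threshold returned by `trivialization_threshold()`, in line with the gap mentioned in the text.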



6.3.1 Superstrong correlations trivialize communication complexity

We shall present a proof that the availability of a perfect NLB would allow for a solution of a general communication complexity problem for a binary function with an exchange of a single bit of information. The proof is due to van Dam [45].

Consider a Boolean function $f: \{0,1\}^n \times \{0,1\}^n \to \{0,1\}$, which has as inputs two $n$-bit strings $\mathbf{x} = (x_1,\ldots,x_n)$ and $\mathbf{y} = (y_1,\ldots,y_n)$. Suppose that Alice receives the $\mathbf{x}$ string and Bob, who is separated from Alice, the $\mathbf{y}$ string, and they are to determine the function value $f(\mathbf{x},\mathbf{y})$ by communicating as little as possible. They have, however, NLBs as resources.

First, let us notice that any dichotomic function $f(\mathbf{x},\mathbf{y})$ can be rewritten as a finite summation:

\[ f(\mathbf{x},\mathbf{y}) = \sum_{i=1}^{2^n} P_i(\mathbf{x})\, Q_i(\mathbf{y}), \tag{85} \]

where $P_i(\mathbf{x})$ are polynomials in $x_k \in \{0,1\}$ and $Q_i(\mathbf{y}) = y_1^{i_1}\cdots y_n^{i_n}$ are monomials in $y_k \in \{0,1\}$ with $i_1,\ldots,i_n \in \{0,1\}$. Here and below all sums of bits are understood modulo 2. Note that the monomials constitute an orthogonal basis in a $2^n$-dimensional space. The decomposed function $f$ is treated as a function of the $y$'s, while the inputs $x_1,\ldots,x_n$ are considered as indices numbering the functions $f$. Note that there are $2^n$ different monomials. Alice can locally compute all the $P_i$ values by herself, and likewise Bob can compute all the $Q_i$ by himself. These values determine the settings of Alice and Bob that will be chosen in the $i$-th run of the experiment. Note that to this end they need in general exponentially many NLBs. Alice and Bob perform for every $i \in \{1,\ldots,2^n\}$ a measurement on the $i$-th NLB in order to obtain, without any communication, a collection of bit values $a_i$ and $b_i$ with the property $a_i \oplus b_i = P_i(\mathbf{x})\,Q_i(\mathbf{y})$. Bob can add all his $b_i$ values to $\sum_{i=1}^{2^n} b_i$ without requiring any information from Alice, and he can broadcast this single bit to Alice. She, on her part, computes the sum of her $a_i$, $\sum_{i=1}^{2^n} a_i$, and adds Bob's bit to it. The final result

\[ \sum_{i=1}^{2^n} (a_i \oplus b_i) = \sum_{i=1}^{2^n} P_i(\mathbf{x})\, Q_i(\mathbf{y}) = f(\mathbf{x},\mathbf{y}) \tag{86} \]

is the function value. Thus, superstrong correlations trivialize every communication complexity problem.
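
The protocol can be simulated for small $n$. The sketch below is an illustration under stated assumptions, not van Dam's own construction: the coefficients $P_i(\mathbf{x})$ are obtained here by Moebius inversion over GF(2) (one concrete way to realize the decomposition), the ideal box is emulated by a function that sees both inputs (which a simulation needs, while the real parties would not), and all names are hypothetical.

```python
import random
from itertools import product


def pr_box(p, q, rng):
    """Ideal nonlocal box: bits (a, b) with a XOR b == p AND q, a uniformly random."""
    a = rng.randrange(2)
    return a, a ^ (p & q)


def alice_coefficients(f, x, n):
    """P_s(x) for every exponent vector s, treating y -> f(x, y) as a GF(2) polynomial."""
    coeffs = {}
    for s in product((0, 1), repeat=n):
        value = 0
        # Moebius inversion: XOR of f(x, t) over all t that are bitwise <= s.
        for t in product((0, 1), repeat=n):
            if all(ti <= si for ti, si in zip(t, s)):
                value ^= f(x, t)
        coeffs[s] = value
    return coeffs


def bob_monomial(s, y):
    """Q_s(y): product of the y_k over all positions k with s_k = 1."""
    return int(all(yk for sk, yk in zip(s, y) if sk))


def compute_with_one_bit(f, x, y, rng):
    """Evaluate f(x, y) using 2^n boxes and a single bit sent from Bob to Alice."""
    n = len(x)
    coeffs = alice_coefficients(f, x, n)    # uses only Alice's input x
    alice_parity = bob_parity = 0
    for s, p in coeffs.items():
        a, b = pr_box(p, bob_monomial(s, y), rng)   # one box per monomial
        alice_parity ^= a
        bob_parity ^= b
    message = bob_parity                    # the only communicated bit
    return alice_parity ^ message
```

For any truth table `f` on two `n`-bit tuples, `compute_with_one_bit(f, x, y, random.Random(0))` returns `f(x, y)`, while each individual box output is a uniformly random bit. The number of boxes, $2^n$, reflects the exponential resource cost noted in the text.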



7 The Kochen-Specker Theorem 



In previous sections we have seen that tests of Bell's inequalities are not only theory-independent tests of non-classicality, but also have applications in quantum information protocols. Examples are communication complexity problems [38], entanglement detection [47], security of key distribution [2], and quantum state discrimination [48]. Thus entanglement which violates local realism can be seen as a resource for efficient information processing. Can quantum contextuality (the fact that quantum predictions disagree with those of non-contextual hidden-variable theories) also be seen as such a resource? We will give an affirmative answer to this question by considering an explicit example of a quantum game.

The Kochen-Specker theorem is a "no-go" theorem that proves a contradiction between predictions of quantum theory and those of non-contextual hidden variable theories. It was proved by Bell in 1966 [49] and independently by Kochen and Specker in 1967 [50]. The non-contextual hidden-variable theories are based on the conjunction of the following three assumptions:

1. Realism: It is a model that allows one to use all variables $A_m(n)$ in the theoretical description of the experiment, where $A_m(n)$ gives the value of some observable $A_m$ which could be obtained if the knob setting were at position $m$. The index $n$ describes the entire experimental "context" in which $A_m$ is measured and is operationally defined through the positions of all other knob settings in the experiment, which are used to measure other observables jointly with $A_m$. All $A_m(n)$'s are treated as perhaps unknown, but still fixed, (real) numbers, or variables for which a proper joint probability distribution can be defined.

2. Non-contextuality: The value assigned to an observable $A_m(n)$ of an individual system is independent of the experimental context $n$ in which it is measured, in particular of any properties that are measured jointly with that property. This implies that $A_m(n) = A_m$ for all contexts $n$.

3. "Free will": The experimenter is free to choose the observable and its context. The choices are independent of the actual hidden values of the $A_m$'s, etc.

Note that "non-contextuality" implies locality (i.e., non-contextuality with respect to a remote context), but there is no implication the other way round. One might have theories which are local, but locally contextual.

It should be stressed that the local realistic and non-contextual theories provide us with predictions which can be tested experimentally, and which can be derived without making any reference to quantum mechanics (though many derivations in the literature give exactly the opposite impression). In order to achieve this, it is important to realize that predictions for noncontextual realistic theories can be derived in a completely operational way [53]. For concreteness, imagine that an observer wants to perform a measurement of an observable, say the square, $S_n^2$, of a spin component of a spin-1 particle along a certain direction $\mathbf{n}$. There will be an experimental procedure for trying to do this as accurately as possible. We will refer to this procedure by saying that one sets the "control switch" of his/her apparatus to the position $\mathbf{n}$. In all experiments that we will discuss only a finite number of different switch positions is required. By definition, different switch positions are clearly distinguishable for the observer, and the switch position is all he knows about. Therefore, in an operational sense the measured physical observable is entirely defined by the switch position. From the above definition it is clear that the same switch position can be chosen again and again in the course of an experiment. Notice that in the approach described above it does not matter which observable is "really" measured and to what precision. One just derives general predictions, provided that certain switch positions are chosen.

In the original Kochen-Specker proof [50], the observables that are considered are squares of components of the spin 1 along various directions. Such observables have values 1 or 0, as the components themselves have values 1, 0, or $-1$. The squares of spin components $S_{n_1}^2$, $S_{n_2}^2$ and $S_{n_3}^2$ along any three orthogonal directions $\mathbf{n}_1$, $\mathbf{n}_2$, and $\mathbf{n}_3$ can be measured jointly, simply because the corresponding quantum operators commute with each other. In the framework of a hidden-variable theory one assigns to an individual system a set of numerical values, say $+1, 0, +1, \ldots$, for the squares of spin components $S_{n_1}^2, S_{n_2}^2, S_{n_3}^2, \ldots$ along each direction that can be measured on the system. If any of the observables is chosen to be measured on the individual system, the result of the measurement would be the corresponding value. In a non-contextual hidden variable theory one has to assign to an observable, say $S_{n_1}^2$, the same value independently of whether it is measured in an experimental procedure jointly as a part of some set $\{S_{n_1}^2, S_{n_2}^2, S_{n_3}^2\}$ or of some other set $\{S_{n_1}^2, S_{n_4}^2, S_{n_5}^2\}$ of physical observables, where $\{\mathbf{n}_1,\mathbf{n}_2,\mathbf{n}_3\}$ and $\{\mathbf{n}_1,\mathbf{n}_4,\mathbf{n}_5\}$ are triads of orthogonal directions. Notice that within quantum theory some of the operators corresponding to the observables from the first set may not commute with some corresponding to the observables from the second set.

The squares of spin components along orthogonal directions satisfy

\[ S_{n_1}^2 + S_{n_2}^2 + S_{n_3}^2 = s(s+1) = 2. \tag{87} \]

This is always so for a particle of spin 1 ($s=1$). This implies that for every measurement of three squares of mutually orthogonal spin components two of the results will be equal to one, and one of them will be equal to zero. The Kochen-Specker theorem considers a set of triads of orthogonal directions $\{\mathbf{n}_1,\mathbf{n}_2,\mathbf{n}_3\}, \{\mathbf{n}_1,\mathbf{n}_4,\mathbf{n}_5\}, \ldots$, for which at least some of the directions have to appear in several of the triads. The statement of the theorem is that there are sets of directions for which it is not possible to give any assignment of 1's and 0's to the directions consistent with the constraint (87). The original theorem in [50] used 117 vectors, but this has subsequently been reduced to 33 vectors [51] and 18 vectors [52]. Mathematically, the contradiction with quantum predictions has its origin in the fact that the classical structure of non-contextual hidden variable theories is represented by a commutative algebra, whereas quantum mechanical observables need not be commutative, making it impossible to embed the algebra of these observables in a commutative algebra.
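
Relation (87) and the joint measurability of the three squared components can be checked with explicit matrices. The sketch below is an illustration added for concreteness, not part of the Kochen-Specker proof: it assumes the standard spin-1 matrices in the $S_z$ eigenbasis (with $\hbar = 1$) for the three coordinate axes, and the helper names are hypothetical.

```python
import math

R = 1 / math.sqrt(2)
# Standard spin-1 matrices in the S_z eigenbasis (hbar = 1); an assumed representation.
SX = [[0, R, 0], [R, 0, R], [0, R, 0]]
SY = [[0, -1j * R, 0], [1j * R, 0, -1j * R], [0, 1j * R, 0]]
SZ = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def is_close(a, b, tol=1e-12):
    """Entry-wise comparison of two 3x3 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(3) for j in range(3))


SQUARES = [matmul(s, s) for s in (SX, SY, SZ)]
TWO_IDENTITY = [[2 if i == j else 0 for j in range(3)] for i in range(3)]


def sum_of_squares():
    """S_x^2 + S_y^2 + S_z^2, which should equal s(s+1) = 2 times the identity."""
    return [[sum(sq[i][j] for sq in SQUARES) for j in range(3)] for i in range(3)]


def squares_commute():
    """True if the three squared components commute pairwise."""
    return all(is_close(matmul(p, q), matmul(q, p)) for p in SQUARES for q in SQUARES)
```

Each squared component is a projector with eigenvalues 0 and 1, so a joint measurement along an orthogonal triad yields two ones and one zero, as stated above.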

The disproof of noncontextuality relies on the assumption that the same value is assigned to a given physical observable, $S_n^2$, regardless of which two other observables the experimenter chooses to measure together with it. In quantum theory the additional observables from one of those sets correspond to operators that do not commute with the operators corresponding to additional observables from the other set. As was stressed in a masterly review on hidden variable theories by Mermin [54], Bell wrote [49] that "These different possibilities require different experimental arrangements; there is no a priori reason to believe that the results ... should be the same. The result of observation may reasonably depend not only on the state of the system (including hidden variables) but also on the complete disposition of the apparatus." Nevertheless, as Bell himself showed, the disagreement between predictions of quantum mechanics and of the hidden-variable theories can be strengthened if non-contextuality is replaced by a much more compelling assumption of locality. Note that in Bohr's doctrine of the inseparability of the object and the measuring instrument, an observable is defined through the entire measurement procedure applied to measure it. Within this doctrine one would not speak about measuring the same observable in different contexts, but rather about measuring entirely different maximal observables, and deriving from them the value of a degenerate observable. Note that the Kochen-Specker argument necessarily involves degenerate observables. This is why it does not apply to single qubits.



7.1 A Kochen-Specker Game 

We will now consider a quantum game which is based on the Kochen-Specker argument strengthened by the locality condition (see Ref. [55]). We consider a pair of entangled spin-1 particles, which form a singlet state with total spin 0. A formal description of this state is given by

\[ |\psi\rangle = \frac{1}{\sqrt{3}} \left( |1\rangle_n |-1\rangle_n + |-1\rangle_n |1\rangle_n - |0\rangle_n |0\rangle_n \right), \tag{88} \]

where, for example, $|1\rangle_n |-1\rangle_n$ is the state of the two particles with spin projection $+1$ for the first particle and spin projection $-1$ for the second particle along the same direction $\mathbf{n}$. It is important to note that this state is invariant under a change of the direction $\mathbf{n}$. This implies that if the spin components of the two particles are measured along an arbitrary direction, provided it is the same on both sides, the sum of the two local results is always zero. This is a direct consequence of the conservation of angular momentum.

We now present the quantum game introduced in Ref. [56]. The requirement in the proof of the Kochen-Specker theorem can be formulated as the following problem in geometry. There exists an explicit set of vectors $\{\mathbf{n}_1,\ldots,\mathbf{n}_m\}$ in $\mathbb{R}^3$ that cannot be colored in red (i.e., assign the value 1 to the spin squared component along that direction) or blue (i.e., assign the value 0) such that both of the following conditions hold:

1. For every orthogonal pair of vectors $\mathbf{n}_i$ and $\mathbf{n}_j$, they are not both colored red.

2. For every mutually orthogonal triple of vectors $\mathbf{n}_i$, $\mathbf{n}_j$, and $\mathbf{n}_k$, at least one of them is colored red.

For example, the set of vectors can consist of the 117 vectors from the original Kochen-Specker proof [50], 33 vectors from Peres's proof or 18 vectors from Cabello's proof [52].
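
The two coloring conditions are easy to state as a search problem. The sketch below is only an illustration of the rules: the function names are hypothetical, the exhaustive search is exponential in the number of vectors, and the five-vector example set is an assumption chosen because it is small and colorable. The uncolorable sets from the cited proofs are not reproduced here.

```python
from itertools import combinations, product


def dot(u, v):
    """Scalar product of two integer vectors."""
    return sum(a * b for a, b in zip(u, v))


def find_coloring(vectors):
    """Return colors (1 = red, 0 = blue) obeying both rules, or None if impossible."""
    indices = range(len(vectors))
    pairs = [(i, j) for i, j in combinations(indices, 2)
             if dot(vectors[i], vectors[j]) == 0]
    pair_set = set(pairs)
    triples = [(i, j, k) for i, j, k in combinations(indices, 3)
               if (i, j) in pair_set and (i, k) in pair_set and (j, k) in pair_set]
    for colors in product((0, 1), repeat=len(vectors)):
        # Rule 1: two orthogonal vectors are never both red.
        if any(colors[i] and colors[j] for i, j in pairs):
            continue
        # Rule 2: every mutually orthogonal triple contains at least one red vector.
        if any(not (colors[i] or colors[j] or colors[k]) for i, j, k in triples):
            continue
        return colors
    return None


# A small, colorable example set (an assumption made for illustration).
EXAMPLE = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0), (1, -1, 0)]
```

For `EXAMPLE` the search succeeds by coloring the shared vector $(0,0,1)$ red and all others blue. For a Kochen-Specker set the same function would return `None`, although brute force quickly becomes impractical for sets of that size.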

The Kochen-Specker game employs the above sets of vectors. Consider two separated parties, Alice and Bob. Alice receives a random triple of orthogonal vectors as her input and Bob receives a single vector randomly chosen from the triple as his input. Alice is asked to give a trit indicating which of her three vectors is assigned color 1 (implicitly, the other two vectors are assigned color 0). Bob outputs a bit assigning a color to his vector. The requirement is that Alice and Bob assign the same color to the vector that they receive in common. It is straightforward to show that the existence of a perfect classical strategy for this game, in which Alice and Bob can share classically correlated strings, would violate the reasoning used in the Kochen-Specker theorem. On the other hand, there is a perfect quantum strategy using the entangled state (88). If Alice and Bob share two particles in this state, Alice can perform a measurement of the squared spin components pertaining to directions $\{\mathbf{n}_i,\mathbf{n}_j,\mathbf{n}_k\}$, which are equal to those of the three input vectors, and Bob measures the squared spin component in direction $\mathbf{n}_l$ for his input. Then Bob's measurement will necessarily yield the same answer as the measurement by Alice along the same direction.

Concluding this section, we note that quantum contextuality is also closely related to quantum error correction [57], quantum key distribution [58], one-location quantum games [59], and entanglement detection between internal degrees of freedom.



7.2 Temporal Bell's Inequalities (Leggett-Garg Inequalities) 

In this last section we will consider one more basic information processing task, the random access code problem. It can be solved with a quantum set-up with a higher efficiency than is classically possible. We will show that the resource for better-than-classical efficiency is a violation of "temporal Bell's inequalities", the inequalities that are satisfied by temporal correlations of a certain class of hidden-variable theories. Instead of considering correlations between measurement results on distantly located physical systems, here we focus on one and the same physical system and analyze correlations between measurement outcomes at different times. The inequalities were first introduced by Leggett and Garg [60] in the context of testing superpositions of macroscopically distinct quantum states. Since our aim here is different, we will look at general assumptions that allow us to derive temporal Bell's inequalities irrespective of whether the object under consideration is macroscopic or not. This is why our assumptions differ from the original ones of Ref. [60]. Compare also Refs. [65], [66], [67].

We consider theories which are based on the conjunction of the following four assumptions$^5$:

1. Realism: It is a model that allows one to use all variables $A_m(t)$, $m = 1,2,\ldots$, in the theoretical description of the experiment performed at time $t$, where $A_m(t)$ gives the value of some observable which could be obtained if it were measured at time $t$. All $A_m(t)$'s are treated as perhaps unknown, but still fixed numbers, or variables for which a proper joint probability distribution can be defined.

2. Non-invasiveness: The value assigned to an observable $A_m(t_1)$ at time $t_1$ is independent of whether or not a measurement was performed at some earlier time $t_0$, or of which observable $A_n(t_0)$, $n = 1,2,\ldots$, was measured at that time. In other words, (actual or potential) measurement values $A_m(t_1)$ at time $t_1$ are independent of the measurement settings chosen at earlier times $t_0$.

3. Induction: The standard arrow of time is assumed. In particular, the values $A_m(t_0)$ at earlier times $t_0$ do not depend on the choices of measurement settings at later times $t_1$.$^6$

4. "Free will": The experimenter is free to choose the observable. The choices are independent of the actual hidden values of the $A$'s, etc.

Consider an observer and allow her to choose at time $t_0$ and at some later time $t_1$ to measure one of two dichotomic observables $A_1(t_i)$ and $A_2(t_i)$, $i \in \{0,1\}$. The assumptions given above imply the existence of numbers for $A_1(t_i)$ and $A_2(t_i)$, each taking values either $+1$ or $-1$, which describe the (potential or actual) predetermined result of the measurement. For the temporal correlations in an individual experimental run the following identity holds: $A_1(t_0)[A_1(t_1) - A_2(t_1)] + A_2(t_0)[A_1(t_1) + A_2(t_1)] = \pm 2$. With similar steps as in the derivation of the standard Bell's inequalities, one easily obtains:

\[ p(A_1 A_1 = 1) + p(A_1 A_2 = -1) + p(A_2 A_1 = 1) + p(A_2 A_2 = 1) \leq 3, \tag{89} \]

where we omit the dependence on time: in each product the first factor is measured at $t_0$ and the second at $t_1$.

An important difference between quantum contextuality and temporal Bell's inequalities is that the latter can also be tested on single qubits or two-dimensional quantum systems. We will now calculate the temporal correlation function for consecutive measurements of a single qubit. Take an arbitrary mixed state of a qubit, written as $\rho = \frac{1}{2}(\mathbb{1} + \mathbf{r}\cdot\boldsymbol{\sigma})$, where $\mathbb{1}$ is the identity operator, $\boldsymbol{\sigma} = (\sigma_x,\sigma_y,\sigma_z)$ are the Pauli operators for three orthogonal directions $x$, $y$ and $z$, and $\mathbf{r} = (r_x,r_y,r_z)$ is the Bloch vector with the components $r_i = \mathrm{Tr}(\rho\sigma_i)$.

5 There is one more difference between the present approach and that of Ref. [60]. While there the observer measures a single observable having a choice between different times of measurement, here at any given time the observer has a choice between two (or more) different measurement settings. One can use both approaches to derive temporal Bell's inequalities.

6 Note that this already follows from "non-invasiveness" when applied symmetrically to both arrows of time.

Suppose that the measurement of the observable $\boldsymbol{\sigma}\cdot\mathbf{a}$ is performed at time $t_0$, followed by the measurement of $\boldsymbol{\sigma}\cdot\mathbf{b}$ at $t_1$, where $\mathbf{a}$ and $\mathbf{b}$ are the directions along which the spin is measured. The quantum correlation function is given by $E_{QM}(\mathbf{a},\mathbf{b}) = \sum_{k,l=\pm 1} k\,l\,\mathrm{Tr}(\rho\,\pi_{a,k})\,\mathrm{Tr}(\pi_{a,k}\,\pi_{b,l})$, where, e.g., $\pi_{a,k}$ is the projector onto the subspace corresponding to the eigenvalue $k = \pm 1$ of the spin along $\mathbf{a}$. Here we use the fact that after the first measurement the state is projected onto the new state $\pi_{a,k}$. Therefore, the probability to obtain the result $k$ in the first measurement and $l$ in the second one is given by $\mathrm{Tr}(\rho\,\pi_{a,k})\,\mathrm{Tr}(\pi_{a,k}\,\pi_{b,l})$. Using $\pi_{a,k} = \frac{1}{2}(\mathbb{1} + k\,\boldsymbol{\sigma}\cdot\mathbf{a})$ and $\frac{1}{2}\mathrm{Tr}[(\boldsymbol{\sigma}\cdot\mathbf{a})(\boldsymbol{\sigma}\cdot\mathbf{b})] = \mathbf{a}\cdot\mathbf{b}$, one can easily show that the quantum correlation function can simply be written as

\[ E_{QM}(\mathbf{a},\mathbf{b}) = \mathbf{a}\cdot\mathbf{b}. \tag{90} \]

Note that, in contrast to the usual correlation function, the temporal one (90) does not depend on the initial state $\rho$. Note also that a slight modification of our derivation of Eq. (90) also applies to the cases in which the system evolves between the two measurements following an arbitrary unitary transformation.

The scalar product form of quantum correlations (90) allows for the violation of the temporal Bell inequality, and the maximal value of the left-hand side of (89) is achieved for the choice of the measurement settings $\mathbf{a}_1 = \frac{1}{\sqrt{2}}(\mathbf{b}_1 - \mathbf{b}_2)$, $\mathbf{a}_2 = \frac{1}{\sqrt{2}}(\mathbf{b}_1 + \mathbf{b}_2)$ and is equal to $2 + \sqrt{2}$.
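
The state independence of the temporal correlation and the value $2+\sqrt{2}$ can both be checked with explicit $2\times 2$ matrices. The sketch below is illustrative: the function names are hypothetical, the directions $\mathbf{b}_1 = \hat{x}$ and $\mathbf{b}_2 = \hat{z}$ are an assumed orthogonal pair, and the sign pattern in `leggett_garg_value` assumes that the second term of the inequality is the anticorrelated one, in line with the identity used to derive it.

```python
import math

IDENTITY = [[1, 0], [0, 1]]
PAULI = (
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


def spin(n):
    """The operator sigma . n for a real 3-vector n."""
    return [[sum(n[c] * PAULI[c][i][j] for c in range(3)) for j in range(2)]
            for i in range(2)]


def half_identity_plus(n, k=1):
    """(1 + k sigma . n)/2: a projector for unit n, a density matrix for |n| <= 1."""
    s = spin(n)
    return [[0.5 * (IDENTITY[i][j] + k * s[i][j]) for j in range(2)] for i in range(2)]


def temporal_correlation(r, a, b):
    """Sum over k, l of k * l * Tr(rho pi_{a,k}) * Tr(pi_{a,k} pi_{b,l})."""
    rho = half_identity_plus(r)
    total = 0
    for k in (1, -1):
        pa = half_identity_plus(a, k)
        p_first = trace(matmul(rho, pa))
        for l in (1, -1):
            total += k * l * p_first * trace(matmul(pa, half_identity_plus(b, l)))
    return total.real


def leggett_garg_value(r=(0.0, 0.0, 1.0)):
    """Left-hand side of the temporal inequality for the settings quoted in the text."""
    s = 1 / math.sqrt(2)
    b1, b2 = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
    a1 = tuple(s * (u - v) for u, v in zip(b1, b2))
    a2 = tuple(s * (u + v) for u, v in zip(b1, b2))

    def p_same(a, b):
        return (1 + temporal_correlation(r, a, b)) / 2

    return p_same(a1, b1) + (1 - p_same(a1, b2)) + p_same(a2, b1) + p_same(a2, b2)
```

Changing the Bloch vector `r`, including the maximally mixed state `(0, 0, 0)`, leaves both `temporal_correlation` and `leggett_garg_value` unchanged, which is the state independence noted in the text.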



7.3 Quantum Random Access Codes 

A random access code is a communication task for two parties, whom we again call Alice and Bob. Alice receives some classical $n$-bit string known only to her (her local input). She is allowed to send just a one-bit message, $m$, to Bob. Bob is asked to tell the value of the $b$-th bit of Alice, $b = 1,2,\ldots,n$. However, $b$ is known only to him (this is his local input data). The goal is to construct a protocol enabling Bob to tell the value of the $b$-th bit of Alice with as high an average probability of success as possible, for a uniformly random distribution of Alice's bit strings and a uniform distribution of $b$'s. Note that Alice does not know in advance which bit Bob is to recover. Thus she has no option to send just this required bit.

If they share a quantum channel, one speaks of a quantum version of the
previous problem. Alice is asked to encode her classical n-bit message into one qubit
(quantum bit) and send it to Bob. He performs some measurement on the received
qubit to extract the required bit. In general, the measurement that he uses will depend
on which bit he wants to reveal. The idea behind these so-called quantum random
access codes already appeared in a paper written circa 1970 and published in 1983
by Stephen Wiesner [63].

We illustrate the concept of random access code with the simplest scheme, in
which in a classical framework Alice needs to encode a two-bit string $b_0 b_1$ into a
single bit, or into a single qubit in a quantum framework.

In the classical case Alice and Bob need to decide on a protocol defining which
bit-valued message is to be sent by Alice for each of the four possible values of her
two-bit string $b_0 b_1$. There are only $2^4 = 16$ different deterministic protocols, thus
the probability of success can be evaluated in a straightforward way. The optimal
deterministic classical protocols can then be shown to have a probability of success
$P_C = 3/4$. For example, if Alice sends one of the two bits, then Bob will reveal
this bit with certainty and have probability of 1/2 to reveal the other one. Since
any probabilistic protocol can be represented as a convex combination of the 16
deterministic protocols, the corresponding probability of success for any such
probabilistic protocol will be given by the weighted sum of the probabilities of success
of the individual deterministic protocols. This implies that the optimal probabilistic
protocols can at best be as efficient as the optimal deterministic protocol, which is
3/4.

Ambainis et al. [64] showed that there is a quantum solution of the random access
code with probability of success $P_Q = \cos^2(\pi/8) \approx 0.85$. It is realized as follows:
depending on her two-bit string $b_0 b_1$, Alice prepares one of the four states
$|\psi_{b_0 b_1}\rangle$. These states are chosen to be on the equator of the Bloch sphere, separated
by equal angles of $\pi/2$ radians (see figure 3). Using the Bloch sphere parametrization
$|\psi(\theta,\phi)\rangle = \cos(\theta/2)|0\rangle + \exp(i\phi)\sin(\theta/2)|1\rangle$, the four encoding states are
represented as:

$|\psi_{00}\rangle = |\psi(\pi/2, \pi/4)\rangle$,
$|\psi_{01}\rangle = |\psi(\pi/2, 7\pi/4)\rangle$,
$|\psi_{10}\rangle = |\psi(\pi/2, 3\pi/4)\rangle$,
$|\psi_{11}\rangle = |\psi(\pi/2, 5\pi/4)\rangle$. (91)

Bob's measurements, which he uses to guess the bits, will depend on which bit he
wants to obtain. To guess $b_0$, he projects the qubit along the x-axis in the Bloch
sphere, and to decode $b_1$ he projects it along the y-axis. He then estimates the bit
value to be 0 if the measurement outcome was along the positive direction of the
axis and 1 if it was along the negative direction. It can easily be calculated that the
probability of successfully retrieving the correct bit value is the same in all cases:
$P_Q = \cos^2(\pi/8) \approx 0.85$, which is higher than the optimal probability of success
$P_C = 0.75$ of the classical random access code using one bit of communication.

We will now introduce a hidden variable model of the quantum solution to see
that the key resource behind its efficiency is the violation of temporal Bell's inequalities.
Galvao |6T| was the first to point out the relation between violation of Bell-type
inequalities and quantum random access codes. See also Ref. [62] for a relation
with parity-oblivious multiplexing.
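
For comparison with the quantum value, the classical bound $P_C = 3/4$ quoted above can be confirmed by brute force. The sketch below is an illustration under our own bookkeeping, not the authors' procedure: it enumerates the 16 deterministic encodings of the two-bit string into one message bit and, for each, all deterministic guessing tables for Bob. The names `success_probability` and `best_classical` are hypothetical.

```python
from itertools import product

# Alice's four possible two-bit strings (b0, b1)
INPUTS = list(product((0, 1), repeat=2))


def success_probability(encoding, decoders):
    """Average success over uniform strings and a uniform choice of the requested bit.

    encoding: dict mapping (b0, b1) to the message bit m.
    decoders: pair of dicts; decoders[i][m] is Bob's guess for bit i given m.
    """
    correct = 0
    for bits in INPUTS:
        m = encoding[bits]
        for i in (0, 1):
            correct += decoders[i][m] == bits[i]
    return correct / 8  # 4 strings times 2 requested bits


def best_classical():
    """Maximum success probability over all deterministic strategies."""
    best = 0.0
    for msgs in product((0, 1), repeat=4):       # the 16 encodings
        encoding = dict(zip(INPUTS, msgs))
        for g in product((0, 1), repeat=4):      # Bob's guess tables
            decoders = ({0: g[0], 1: g[1]}, {0: g[2], 1: g[3]})
            best = max(best, success_probability(encoding, decoders))
    return best


if __name__ == "__main__":
    print(best_classical())  # 0.75
```

Because the success probability of a probabilistic protocol is a weighted average of these deterministic values, the same maximum bounds all classical protocols with one bit of communication.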






Fig. 3 The set of encoding states and decoding measurements in the quantum random
access code, represented in the x-y plane of the Bloch sphere. Alice prepares one
of the four quantum states $|\psi_{b_0 b_1}\rangle$ to encode two bits $b_0, b_1 \in \{0, 1\}$. Depending
on which bit Bob wants to reveal, he performs a measurement either along the x axis
(to reveal $b_0$) or along the y axis (to reveal $b_1$).
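
The quantum success probability can be checked directly from the encoding states of Eq. (91) and the two decoding measurements. The following is a minimal sketch with illustrative function names; it assumes the decoding convention stated in the text, namely that outcome +1 is read as bit value 0 and outcome -1 as bit value 1.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)     # measurement along x (for b0)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)  # measurement along y (for b1)


def psi(theta, phi):
    """Bloch-sphere state cos(theta/2)|0> + exp(i phi) sin(theta/2)|1>."""
    return np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])


# Encoding states of Eq. (91), all on the equator (theta = pi/2).
ENCODING = {
    (0, 0): psi(np.pi / 2, np.pi / 4),
    (0, 1): psi(np.pi / 2, 7 * np.pi / 4),
    (1, 0): psi(np.pi / 2, 3 * np.pi / 4),
    (1, 1): psi(np.pi / 2, 5 * np.pi / 4),
}


def prob_correct(state, observable, bit):
    """Probability that Bob's outcome decodes to `bit` (+1 -> 0, -1 -> 1)."""
    sign = 1 - 2 * bit
    proj = 0.5 * (np.eye(2) + sign * observable)
    return np.vdot(state, proj @ state).real


def success_probabilities():
    """Success probability for every string and every requested bit."""
    probs = {}
    for (b0, b1), state in ENCODING.items():
        probs[(b0, b1, 0)] = prob_correct(state, SX, b0)
        probs[(b0, b1, 1)] = prob_correct(state, SY, b1)
    return probs


if __name__ == "__main__":
    for key, p in success_probabilities().items():
        print(key, round(p, 4))
    print("cos^2(pi/8) =", round(np.cos(np.pi / 8) ** 2, 4))
```

All eight entries should coincide with $\cos^2(\pi/8)$, since every encoding state makes an angle of $\pi/4$ with the axis that points towards the correct answer.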

A hidden-variable model equivalent to the quantum protocol, which best fits the
temporal Bell's inequalities, can be put as a description of the following modification
of the original quantum protocol. Alice prepares the initial state of her qubit as
a completely random state, described by a density matrix proportional to the unit
operator, $\sigma_0$. The parity of her bit values, $b_0 \oplus b_1$, defines a measurement basis, which
she uses to prepare the state to be sent to Bob. Note that the result of the
dichotomic measurement in the basis defined by $b_0 \oplus b_1$ is, due to the nature of
the initial state, completely random and totally uncontrollable by Alice. To fix the
bit value $b_1$ (and thus also the value $b_0$, since the parity is defined by the choice
of the measurement basis) as she wishes, Alice either leaves the state unchanged, if
the result of the measurement corresponds to her desired $b_1$, or she rotates the state in
the x-y plane by 180° to obtain the orthogonal state, if the result corresponds to
$b_1 \oplus 1$. Just a glance at the states involved in the standard quantum protocol shows
which two complementary (unbiased) bases define her measurement
settings, and which resulting states are linked with which values of $b_0 b_1$. After the
measurement the resulting state is sent to Bob, while Alice is in possession of a bit
pair $b_0 b_1$ which is perfectly correlated with the qubit state on the way to Bob. That
is, we have exactly the same starting point as in the original quantum protocol.
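
The claim that the modified preparation reproduces the original encoding can be checked in a few lines. This is a sketch under a labelling assumption of ours: each measurement outcome is identified with the value of $b_1$ that the corresponding post-measurement state encodes in Eq. (91), and the 180° rotation in the x-y plane is implemented by $\sigma_z$, which acts as such a rotation up to a global phase.

```python
import numpy as np

SZ = np.diag([1.0, -1.0]).astype(complex)

# Azimuthal angles of the encoding states of Eq. (91), all on the equator.
PHI = {(0, 0): np.pi / 4, (0, 1): 7 * np.pi / 4,
       (1, 0): 3 * np.pi / 4, (1, 1): 5 * np.pi / 4}


def equator_state(phi):
    return np.array([1.0, np.exp(1j * phi)]) / np.sqrt(2)


def basis_state(parity, outcome_bit):
    """Post-measurement state in the basis fixed by the parity.

    Assumption: the outcome is labelled by the b1 value that the state encodes,
    so the state corresponds to the pair (outcome_bit xor parity, outcome_bit).
    """
    return equator_state(PHI[(outcome_bit ^ parity, outcome_bit)])


def outcome_probability(parity, outcome_bit):
    """Probability of an outcome when the maximally mixed state is measured."""
    state = basis_state(parity, outcome_bit)
    proj = np.outer(state, state.conj())
    return np.trace(0.5 * np.eye(2) @ proj).real


def prepare(b0, b1, outcome_bit):
    """Alice's state after the measurement and the conditional 180-degree rotation."""
    post = basis_state(b0 ^ b1, outcome_bit)
    if outcome_bit != b1:
        post = SZ @ post  # flips to the orthogonal equatorial state
    return post


def fidelity_with_target(b0, b1, outcome_bit):
    target = equator_state(PHI[(b0, b1)])
    return abs(np.vdot(target, prepare(b0, b1, outcome_bit))) ** 2


if __name__ == "__main__":
    for b0 in (0, 1):
        for b1 in (0, 1):
            for outcome in (0, 1):
                print(b0, b1, outcome,
                      round(outcome_probability(b0 ^ b1, outcome), 3),
                      round(fidelity_with_target(b0, b1, outcome), 3))
```

Each outcome occurs with probability 1/2, and in every case the state sent to Bob has unit fidelity with $|\psi_{b_0 b_1}\rangle$.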

Now, it is obvious that the quantum protocol violates the temporal inequalities,
while any hidden variable model of the above procedure that uses the four assumptions
(1.-4.) behind the temporal inequalities does not violate them. Importantly, the
saturation of the temporal inequalities is equivalent to a probability of success of
3/4.
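
The step from the inequality to the success probability can be made explicit. The following is a sketch under a labelling assumption that the text does not spell out: Alice's settings $a_0$ and $a_1$ are the bases selected by parity 0 and 1, her outcome $+1$ is identified with $b_1 = 0$, Bob's settings are $x$ (guess for $b_0$) and $y$ (guess for $b_1$), and his outcome $+1$ is read as the guess 0. The sign placement may differ from that of Eq. (89) by a relabelling of settings.

```latex
% Success probability for one pair of settings, in terms of the correlation function;
% the minus sign applies only to (a_1, x), because for parity 1 one has b_0 = b_1 \oplus 1:
P(\text{success} \mid a_i, s) = \tfrac{1}{2}\bigl[1 \pm E(a_i, s)\bigr], \qquad s \in \{x, y\}.

% Averaging over the four equally likely pairs of settings:
P = \frac{1}{2} + \frac{S}{8}, \qquad
S = E(a_0, x) + E(a_0, y) - E(a_1, x) + E(a_1, y).

% Hidden variable models respecting (1.-4.):  S \le 2
%   \;\Rightarrow\; P \le \tfrac{1}{2} + \tfrac{2}{8} = \tfrac{3}{4}.
% Quantum protocol:  S = 2\sqrt{2}
%   \;\Rightarrow\; P = \tfrac{1}{2} + \tfrac{\sqrt{2}}{4} = \cos^2(\pi/8).
```

With this bookkeeping, saturating the inequality ($S = 2$) gives exactly the classical value 3/4, while the maximal quantum value of $S$ reproduces the success probability of the quantum random access code.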

The link with temporal Bell's inequalities points to another advantage of quantum
over classical random access codes. Usually, one considers the advantage to be
only resource dependent. By this we mean that there is an advantage as far as one
compares one classical bit with one qubit. Yet, the proof given above shows that
the quantum strategy has an advantage over all hidden variable models respecting
(1.-4.), i.e. also those where Alice and Bob use systems with an arbitrarily large number
of degrees of freedom.

Concluding this section and the chapter, we would like to point to an interesting
research avenue. Here we gave a brief review of the results demonstrating that
"no-go theorems" for various classes of hidden variable theories are behind the
better-than-classical efficiency of many quantum communication protocols. It would be
interesting to investigate the link between fundamental features of quantum
mechanics and the power of quantum computation. It has been shown that temporal
Bell's inequalities distinguish between the classical and quantum search (Grover)
algorithm [68]. Cluster states, a resource for measurement-based quantum computation
(also known as "one-way" quantum computation) in which information is processed
by a sequence of adaptive single-qubit measurements on the state, are shown to
violate Bell's inequalities [69, 70]. Similarly, the CHSH and GHZ problems are
shown to be closely related to measurement-based classical computation, as is
the Popescu-Rohrlich box [71]. These results point to the aforementioned link, but
we are still far from understanding what the key non-classical ingredients are
that give rise to the enhanced quantum computational power. The question becomes even
more fascinating after realizing that not only too little [71, 72, 73, 74, 75, 76] but also
too much entanglement does not allow powerful quantum computation [77, 78].